THE BULLETIN OF 
Mathematical 


BIOPHYSICS 


JUNE 1957 


On the Equilibrium Distribution of Population in Space - Martin J. Beckmann . . . 81

Contributions to the Theory of Imitative Behavior - N. Rashevsky . . . 91

A Statistical Mechanics of Interacting Biological Species - Edward H. Kerner . . . 121

Factors in Visual Acuity: I. Neural Inhibition and the Visual Perception
of Contours - Peter H. Greene . . . 147

On the Interpretation of the Effect of Area on the Critical Flicker
Frequency - H. D. Landahl . . . 157


THE UNIVERSITY OF CHICAGO PRESS - CHICAGO


VOLUME 19 NUMBER 2


The Bulletin of 
MATHEMATICAL BIOPHYSICS 


EDITOR:
N. RASHEVSKY, UNIVERSITY OF CHICAGO

ASSOCIATE EDITORS:
H. D. LANDAHL, UNIVERSITY OF CHICAGO 
ANATOL RAPOPORT, UNIVERSITY OF MICHIGAN 


The BULLETIN is devoted to publications of research in Mathematical 
Biology, as described on the inside back cover. 


THE BULLETIN is published by the University of Chicago at the University 
of Chicago Press, 5750 Ellis Avenue, Chicago 37, Illinois, quarterly, in March, 
June, September, December. ¶ The subscription price for the United States is
$10.00 per volume; the price of single copies is $3.25. Orders for service of 
less than a full volume will be charged at the single-copy rate. The subscription 
price for Canada and Pan America is $10.50 per volume; for all other countries 
in the Postal Union, $11.00 per volume. ¶ Patrons are requested to make all
remittances payable to the University of Chicago Press in postal or express 
money orders or bank drafts. 


THE FOLLOWING is an authorized agent: 


For the British Commonwealth, except North America, India, and Australasia: 
The Cambridge University Press, Bentley House, 200 Euston Road, London, N.W. 1. 
Prices of yearly subscriptions and of single copies may be had on application. 


CLAIMS FOR MISSING NUMBERS should be made within the month following 
the regular month of publication. The publishers expect to supply missing numbers 
free only when losses have been sustained in transit, and when the reserve stock 
will permit. 


BUSINESS CORRESPONDENCE should be addressed to the University of 
Chicago Press, Chicago 37, Ill. 


COMMUNICATIONS FOR THE EDITOR and manuscripts should be addressed
to N. Rashevsky, Editorial Office of The Bulletin of Mathematical Biophysics,
5741 Drexel Avenue, Chicago 37, Ill.


NOTICE TO SUBSCRIBERS 


If you change your address, please notify us and your local postmaster 
immediately. 


[Copyright 1957 by the University of Chicago] 


Permission to reproduce for purely scientific or scholarly purposes any material 
published in this journal will be given upon request. 


Re-entered as second-class matter April 6, 1956, at the Post Office at Lancaster, 
Pennsylvania, under the Act of March 3, 1879. 




ON THE EQUILIBRIUM DISTRIBUTION OF POPULATION 
IN SPACE 


MARTIN J. BECKMANN* 


COWLES FOUNDATION FOR RESEARCH IN ECONOMICS 
YALE UNIVERSITY 


Spatial equilibrium distributions of population are derived from the 
spatial distribution of net rates of reproduction, and from a relationship 
between migratory flow and gradients of population density and of loca- 
tional ‘‘attractiveness.’’ Conditions are discussed for which population 
approaches a uniform spatial density. Under certain conditions a particu- 
larly simple statement of the equilibrium conditions is possible in terms 
of the ‘‘potential of population,’’ a concept introduced by demographers 
(J. Q. Stewart, Geographical Review, 37, 46-85, 1947) to measure the
proximity of a point to people. 


The problem of the distribution of population in space is of
obvious interest and has attracted some attention in the literature.
Empirical regularities concerning frequency distributions of city
size, originally discovered by G. Zipf (1941), have long posed a
challenge to theoretical analysis. This has been met by N. Rashevsky
(1947, pp. 93-107) and H. Simon (1955), among others. Another
avenue of approach, seemingly disconnected, has been in terms of the
potential of population, a concept modeled after Newtonian physics
(Stewart, 1947). In this paper we propose to approach the
equilibrium distribution of population in space in terms of a flow model
of migration. With more specific assumptions this leads to a notion
of the potential of population. No attempt will be made, however, to
derive a frequency distribution of population density. It will
appear that in our model such frequency distributions are ultimately
due to the differential endowment of locations with economic
opportunities, something we do not propose to explain here.


*This paper was first written at the Center for Advanced Study in the 
Behavioral Sciences. 




Consider an economic region in a stationary state of equilibrium. 
At different locations, the densities of population and the per 
capita real incomes will, in general, be different. Partly as a re- 
sult of this, partly for exogenous reasons, the reproductive rates of 
population will also differ within the region. In order to maintain 
the existing distribution of population and income, the surpluses 
and deficits in reproduction must be balanced by migration. Sup- 
pose that net immigration into the region as a whole is zero, so 
that we are concerned with internal migration only. To maintain a 
steady flow of migration, incentives must prevail in the form of 
interlocal differences in per capita real income or attractiveness 
of location. Thus equilibrium implies a complicated balance of in- 
comes, population densities, reproduction rates, and migration flows. 
Under certain conditions the income level may tend to be uniform 
at all locations. But in general a more complicated state of equi- 
librium will result. The purpose of this paper is to develop the 
equilibrium conditions and solve them in the simpler cases. 


1. Preliminaries on Population Density and Income. Both income 
and population density will appear as motivating forces in the 
generation of population deficits and surpluses and in the orienta- 
tion of migration. But they are not independent variables. Given the 
economic opportunities at a location, the level of per capita income 
in real terms is a function of population density. To a smaller extent
it also depends on the distribution of population densities outside 
the location considered. While we shall disregard the direct in- 
fluence of these on the income level at a given location, it will 
appear that indirectly a balance exists between income level and 
the distribution of population around a given location. 

The relationship between income level and population density at 
a given location may be assumed to follow the law of diminishing 
returns. That is, above a certain level per capita income is a de- 
creasing function of population density. For simplicity, we shall 
sometimes use a linear approximation. 
Let u, v be locational coordinates,

p(u, v) the population density,
y(u, v) the per capita (real) income level;

then the economic opportunities at a location are expressed by a
function f(p, u, v),

$$y(u, v) = f[p(u, v), u, v], \tag{1.1}$$

which is decreasing with respect to p. Assuming f to be
differentiable, we have

$$\frac{\partial f}{\partial p} < 0. \tag{1.2}$$


2. Regeneration of Population. What determines the surpluses and 
deficits of population before migration at different locations? We 
must distinguish for each location: 

2.1 The equilibrium rate c of reproduction. This is the annual 
rate of reproduction required to sustain the given level of popula- 
tion in the reproductive age bracket. We shall disregard any dis- 
turbing effects of migration on the age composition of local popula- 
tions. This is not a very unrealistic assumption as long as we 
restrict the notion of population to the reproductive ages. The 
equilibrium rate c depends only on mortality. Inasmuch as mortal- 
ity differs among locations, c is a function of the locational 
coordinates 


c = c(u,v). 


To some extent mortality depends also on income levels and pop- 
ulation densities. But we shall usually disregard this complication. 

2.2 The actual (net) rate of reproduction g. This depends on 
various factors which we shall classify as: 


population density p 
income level y 
location u, v


The last is a catch-all for factors (such as attractiveness of the 
location) not absorbed in either population density (such as degree 
of urbanization) or income (such as economic opportunities). 

Nothing can be said in general about the dependence of g on the
locational coordinates. Of special interest is, of course, the case
in which the rate of reproduction is uninfluenced by the location.
Then the reproductive rate is the same function of population
density and income at all locations,

$$g = g(p, y).$$

We consider next the relationship between rate of reproduction and
population density. At very low densities of population any in- 
crease in density may be expected to raise the rate of reproduction 




because of the increase in the number of contacts. From a certain 
level on, however, urbanization makes the rate of reproduction de- 
crease with population density. As this effect grows stronger with 
rising population density, a point is finally reached from which on 
even the absolute level of reproduction (rate times density) de- 
creases with any further increase in population density. We may 
think of this as a Malthusian equilibrium. 

The effect of income on the reproduction rate is less transparent. 
A clear distinction must be drawn between the correlation of in- 
come and reproduction for a cross-section of a local population at 
a given level of average income, and the response of the aggregate 
rate of reproduction of this local population to a rise in average 
income. It is with the latter only that we are concerned here. 
There is evidence that ‘‘family size and income often tend to be 
positively associated within groups that are otherwise homo- 
geneous’’ (Spengler, 1952, p. 101). 

2.3 Annual population surplus e. This is the net of actual and 
equilibrium rate of reproduction multiplied by the population den- 
sity. Substituting y = f(p,u,v) from (1.1) we see that the rate of 
population surplus may be regarded as a function of population 
density and location alone: 

$$e = e(p, u, v).$$

We shall be interested in Malthusian types of equilibria which are 
attained when population density pushes against the limits of re- 
source availability. An increase of population density then de- 
creases the annual population surplus either directly or indirectly 
through a fall in the income level. 
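
As a minimal illustrative sketch (not part of the original analysis), the surplus of section 2.3 can be computed as e = (g - c) p after substituting y = f(p). The linear forms chosen for `income` and `reproduction_rate`, the function names, and every numerical coefficient are hypothetical assumptions made only for illustration.

```python
def income(p, y_max=10.0, k=0.01):
    """Hypothetical linear approximation to y = f(p): income falls with density."""
    return y_max - k * p


def reproduction_rate(p, y, b0=0.01, b_p=-1e-5, b_y=0.002):
    """Hypothetical net reproduction rate g(p, y), linear in density and income."""
    return b0 + b_p * p + b_y * y


def population_surplus(p, c=0.02):
    """Annual surplus e = (g - c) * p, with y eliminated through y = f(p)."""
    y = income(p)
    g = reproduction_rate(p, y)
    return (g - c) * p


if __name__ == "__main__":
    for density in (100.0, 300.0, 500.0):
        print(density, population_surplus(density))
```

With these assumed coefficients the surplus is positive at low density and negative at high density, which is the Malthusian pattern the text describes.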


3. Forces Directing Migration. We propose to consider migration 
as the net outcome of the random movements of many persons sub- 
ject to an external field force. This force is assumed to be de- 
rived from a potential. Some alternative assumptions about its 
nature and composition will be examined. 

In the simplest case, that is, in the absence of all external 
forces, a pure diffusion process* is present. Concentration itself 


*An approach to human migration in terms of a diffusion process was 
made as early as 1921 by H. Hotelling (unpublished master’s thesis,
University of Washington, cited in Hotelling, 1949). The dynamics of 
the spatial expansion of biological populations has been analyzed in 
terms of random diffusion in an important paper by J. G. Skellam (1951). 


assumes then the character of a repelling force. That is to say, the
vectors of net flow are everywhere a fixed proportion of the gradient
of concentration. Population will thus diffuse from areas of high
density into those of lower density. We denote by

$\varphi(u, v)$ the flow vector, whose direction is that of the flow and
whose length equals the flow density,
D the coefficient of diffusion,

and obtain

$$\varphi(u, v) = -D \operatorname{grad} p(u, v). \tag{3.1}$$

In general $\varphi$ is proportional to the vector sum of the diffusion
force and the external force (Bjerknes, 1933, p. 117, equation 1).
Let a(u, v) be the potential of the external force. Then

$$\varphi(u, v) = \operatorname{grad} a(u, v) - D \operatorname{grad} p(u, v). \tag{3.2}$$


The potential a(u, v) itself may be interpreted as an index of 
the desirability of a location u, v. The assumption next in sim- 
plicity to a(u, v) = 0 is that a(u, v) be given a priori. It is then a 
function of location only. On the diffusion movements are now 
superimposed flows which are induced by the given differences in 
the attractiveness of the locations. 
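
The flow law (3.2) can be illustrated on a rectangular grid. The following is only a sketch under stated assumptions: the fields a and p are stored as lists of rows, gradients are approximated by central differences at interior cells, and the names `gradient` and `flow` as well as the grid spacing `h` are hypothetical.

```python
def gradient(field, i, j, h=1.0):
    """Central-difference gradient of a 2-D grid (list of rows) at interior cell (i, j)."""
    d_u = (field[i + 1][j] - field[i - 1][j]) / (2 * h)
    d_v = (field[i][j + 1] - field[i][j - 1]) / (2 * h)
    return d_u, d_v


def flow(a, p, i, j, D=1.0, h=1.0):
    """Flow vector grad a - D * grad p at interior cell (i, j), following (3.2)."""
    a_u, a_v = gradient(a, i, j, h)
    p_u, p_v = gradient(p, i, j, h)
    return a_u - D * p_u, a_v - D * p_v


if __name__ == "__main__":
    # Density rising along u, no differences in attractiveness:
    # people should flow toward lower density (negative u direction).
    p = [[10.0 * i for _ in range(3)] for i in range(3)]
    a = [[0.0] * 3 for _ in range(3)]
    print(flow(a, p, 1, 1, D=0.5))
```

Setting a to a non-constant grid superimposes the attraction-induced flow on the diffusion term, as the paragraph above describes.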

From an economic point of view it is more satisfactory to let the 
attractiveness of a location be influenced by its (real) income level, 
for prices and incomes are the forces by which demand is brought 
in line with supply. As applied to the present case this means 
that incomes play a compensatory part in attracting people to loca- 
tions which are otherwise less desirable but in need of personnel. 
In general we shall assume therefore 


$$a = a_0(u, v) + a_1[y(u, v)], \tag{3.3}$$

where $a_1$ is a given function, monotonically non-decreasing. In the
simplest case the index of attractiveness may be so normalized
that it is simply additive to income, resulting in a composite index
of attractiveness

$$a = a_0(u, v) + y(u, v).$$


4. The Equations of Equilibrium. The only relationship needed to
complete a system of equilibrium conditions is one equating the
(annual) population net surplus with the (annual) rate of net
emigration. This is supplied by the well-known source-sink equation or
‘‘equation of continuity’’ in hydrodynamics (Kellogg, 1929, p. 48),

$$\operatorname{div} \varphi(u, v) = g(u, v).$$

Here div denotes the divergence operator
$\frac{\partial}{\partial u} + \frac{\partial}{\partial v}$, and g(u, v) is
the rate of net yield from sources or sinks located at (u, v).

Taking the most general case we now have the following system
of equations describing the equilibrium inside the region under
consideration.

$$e(p, y, u, v) = \operatorname{div} \varphi(u, v), \tag{4.1}$$
$$y(u, v) = f(p, u, v), \tag{4.2}$$
$$\varphi(u, v) = -D \operatorname{grad} p(u, v) + \operatorname{grad} a(y, u, v). \tag{4.3}$$

To complete the picture, immigration into the region across the
boundaries $\Gamma$ must be specified. This means that the normal
component $\varphi_n$ of the flow vector must be given at all points of the
boundary. For simplicity we shall assume that immigration is
zero,

$$\varphi_n(u, v) = 0, \qquad (u, v) \in \Gamma. \tag{4.4}$$


The equation system may be reduced to a single equation for the
interior of the region. First, y may be eliminated by substitution
from (4.2). Next, using the identity

$$\operatorname{div} \operatorname{grad} = \Delta = \frac{\partial^2}{\partial u^2} + \frac{\partial^2}{\partial v^2},$$

we may combine (4.1) and (4.3):

$$-D \Delta p + \Delta a[f(p, u, v), u, v] = e[p, f(p, u, v), u, v]. \tag{4.5}$$

The boundary conditions also may be expressed in terms of p:

$$D (\operatorname{grad} p)_n = (\operatorname{grad} a[f(p, u, v), u, v])_n. \tag{4.6}$$

Solutions of this second-order partial differential equation of
the boundary value type will now be discussed for alternative
specifications of the data functions a(y, u, v), e(p, y, u, v), and
f(p, u, v).


5. Homogeneous Region With No Migration at Boundaries. Suppose
that there are no inherent differences between locations of the
region. That means that the locational coordinates as such do not
enter into the functions a, e, f. We will show that under mild
assumptions the only equilibrium possible is one in which population
is distributed at a uniform density and in which migration is absent.
The equations of equilibrium have the form

$$-D \Delta p + \Delta a[f(p)] = e[p, f(p)], \tag{5.1}$$
$$(-D \operatorname{grad} p + \operatorname{grad} a)_n = 0. \tag{5.2}$$

Write $-Dp + a = A$; let $A_0$ be an unspecified constant; and apply
Green’s First Identity (Kellogg, loc. cit., p. 38) to

$$\iint (A - A_0)\, \Delta (A - A_0)\, du\, dv,$$

which yields

$$\iint (A - A_0)\, \Delta (A - A_0)\, du\, dv = \oint (A - A_0) (\operatorname{grad} A)_n\, ds - \iint (\operatorname{grad} A)^2\, du\, dv. \tag{5.3}$$


Because of the boundary condition, the first term on the right-hand
side vanishes. Substituting e[p, f(p)] for $\Delta A$ on the left-hand side
we obtain

$$-\iint (\operatorname{grad} A)^2\, du\, dv = \iint [A(p) - A_0]\, e[p, f(p)]\, du\, dv. \tag{5.4}$$

Now A is a decreasing function of p, since

$$\frac{dA}{dp} = -D + \frac{da}{dy} \frac{df}{dp}, \qquad \frac{da}{dy} \geq 0 \ \text{by (3.3)},$$

and $\frac{df}{dp} < 0$ by assumption (1.2).


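On the other hand e[p, f(p)] is also decreasing, if we assume that
either

$$\frac{\partial e}{\partial p} < 0, \quad \frac{\partial e}{\partial y} \frac{\partial f}{\partial p} \leq 0 \qquad \text{or} \qquad \frac{\partial e}{\partial p} \leq 0, \quad \frac{\partial e}{\partial y} \frac{\partial f}{\partial p} < 0,$$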


that is to say when we are at a point where the annual surplus of
population decreases with population directly or at least indirectly
because of the unfavorable effect on income of a population rise
(cf. section 2.3). We now specify $A_0 = A(p_0)$, where $e[p_0, f(p_0)] = 0$.
Then

$$[A(p) - A_0]\, e[p, f(p)] \geq 0 \tag{5.5}$$

for all p, with ‘‘=’’ only if $p = p_0$. Since the right-hand side of (5.4) is
therefore non-negative, and the left-hand side non-positive, both
sides must vanish, and we conclude that $p = p_0$ everywhere.
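
The uniqueness argument can be checked numerically in a rough way. The sketch below is an illustration under added assumptions, not the paper's method: it uses one space dimension, the additive index a = y with a hypothetical linear income function f(p) = y0 - k p, a linear surplus e = e0 - e1 p, and an assumed adjustment rule dp/dt = e - div(flow), whereas the paper treats only the stationary state. Under these assumptions the flow is -(D + k) dp/dx, and with zero flow at both ends any initial density should relax to the uniform value p0 = e0 / e1.

```python
def relax_to_equilibrium(p, D=0.8, k=0.2, e0=2.0, e1=0.01, dt=0.2, steps=5000):
    """Explicit Euler steps of dp/dt = e0 - e1*p + (D + k) * p'' on a unit-spaced
    1-D grid with zero-flux (mirrored) ends. All parameter values are hypothetical."""
    kappa = D + k
    p = list(p)
    n = len(p)
    for _ in range(steps):
        new = []
        for i in range(n):
            left = p[i - 1] if i > 0 else p[i]        # mirror cell: no flow across the boundary
            right = p[i + 1] if i < n - 1 else p[i]
            laplacian = left - 2.0 * p[i] + right
            new.append(p[i] + dt * (e0 - e1 * p[i] + kappa * laplacian))
        p = new
    return p


if __name__ == "__main__":
    start = [50.0, 400.0, 120.0, 300.0, 10.0, 250.0]
    print(relax_to_equilibrium(start))   # every cell ends close to e0 / e1 = 200
```

The time step is chosen small enough for the explicit scheme to remain stable with these coefficients; other parameter choices may require a smaller `dt`.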


We state as a conjecture here that the equilibrium is unstable 
under any alternative assumptions about e and f which would make 


A(p) an increasing function. For this would permit an unchecked 
increase of population at any location. 


6. Linear Case. Let the annual population surplus be linearly
dependent on the population density at a location, with a uniform
slope coefficient $-e_1$, $e_1$ being independent of position:

$$e[p, f(p), u, v] = e_0(u, v) - e_1 p. \tag{6.1}$$

This implies that the net rate of reproduction is a function of the
form

$$g = \frac{g_0(u, v)}{p} - g_1,$$

where $g_1$ is a constant. For small densities of population (in
rural areas) the rate of reproduction is positive; for large densities
it is negative. The equation of equilibrium has the form

$$-D \Delta p + \Delta a[f(p, u, v), u, v] = e_0(u, v) - e_1 p. \tag{6.2}$$


Consider now the special case of an unbounded region, whose
population density vanishes at distances approaching infinity. For
the following argument it is mathematically convenient to regard
the distribution of population and the flow field as three dimensional,
with coordinates u, v, w. Any limits one wishes to place on the
extension into the third dimension are conveniently imposed through
the densities p and e. The differential equation of population
equilibrium remains unchanged if the operations div and grad are
used in their three-dimensional meaning. The differential equation
(6.2) may now be transformed like a Newtonian field equation
(Kellogg, loc. cit., p. 156):


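$$-Dp + a = \frac{1}{4\pi} \iiint \frac{e_0(U, V, W) - e_1\, p(U, V, W)}{r}\, dU\, dV\, dW, \tag{6.3}$$

where $r^2 = (u - U)^2 + (v - V)^2 + (w - W)^2$.

Write now

$$P(u, v, w) = \frac{1}{4\pi} \iiint \frac{p(U, V, W)}{r(u, v, w; U, V, W)}\, dU\, dV\, dW,$$

$$Q(u, v, w) = \frac{1}{4\pi} \iiint \frac{e_0(U, V, W)}{r(u, v, w; U, V, W)}\, dU\, dV\, dW.$$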




The first expression is known in the literature as the potential 
of population. It is considered an inverse measure of the proximity 
of a point to people. A more ambitious interpretation regards it as 
the demographic counterpart of the Newtonian potential of mass 
attraction (Stewart, 1947). The second expression, which is not 
particularly noteworthy, is independent of the actual distribution 
of population and may therefore be considered a datum for each 
location. 
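
For a finite set of settlements the integral defining the potential of population reduces to a sum of population over distance. The sketch below is a hedged illustration in two dimensions: the function name `potential_of_population`, the `(x, y, population)` tuple layout, and the rule of skipping a settlement located exactly at the evaluation point are assumptions, and any constant prefactor is left out.

```python
import math


def potential_of_population(point, settlements):
    """Sum of population / distance over all settlements not located at `point`.

    `settlements` is an iterable of (x, y, population) tuples (hypothetical layout).
    """
    total = 0.0
    for x, y, population in settlements:
        r = math.hypot(point[0] - x, point[1] - y)
        if r > 0:                      # a settlement at the point itself would give 1/0
            total += population / r
    return total


if __name__ == "__main__":
    towns = [(0.0, 0.0, 100.0), (3.0, 4.0, 50.0)]
    print(potential_of_population((0.0, 4.0), towns))
    print(potential_of_population((3.0, 0.0), towns))
```

How to treat a location's own population is a modeling choice; skipping it, as done here, is only one simple convention.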

The transformed equilibrium equation is thus 


$$p(u, v, w) = \frac{1}{D}\, a(u, v, w) + \frac{e_1}{D}\, P(u, v, w) - \frac{1}{D}\, Q(u, v, w). \tag{6.4}$$


It states that under certain assumptions the equilibrium density of 
population is everywhere an increasing linear function of the op- 
portunity density and the potential of population. 

As Professor H. D. Landahl has pointed out to me, if e represents
a ‘‘logistic’’ birth-death rate process so that $e = \alpha p - \beta p^2$,
then we can write instead of (6.4)

$$p = \frac{a}{D} - \frac{P_1}{D} + \frac{P_2}{D},$$

where $P_1$ is a ‘‘first order’’ potential, $P_2$ is a ‘‘second order’’ or
‘‘second moment’’ potential such that if $\alpha$ and $\beta$ depend on
position, in general,

$$P_1 = \frac{1}{4\pi} \iiint \frac{\alpha(U, V, W)\, p}{r}\, dU\, dV\, dW,$$

$$P_2 = \frac{1}{4\pi} \iiint \frac{\beta(U, V, W)\, p^2}{r}\, dU\, dV\, dW.$$


The empirical observation that in rural areas population density
appears to be proportional to the square of the potential of
population (Stewart, loc. cit.) rather than being linearly related to it
suggests that the mobility of rural population is different from that
suggested by a pure diffusion model. It is tempting to introduce
differential degrees of mobility into the model by letting flow be
proportional to the gradient of some power, with exponent less than
unity, of population density. An exponent of ½ would be compatible
with the observed rural density distribution, since the square root
of density would then be linear in the potential, provided that a and Q
would roughly cancel out. However, such special hypotheses shall
not be further pursued here.




In revising this paper, I have benefited from penetrating com- 
ments by Professors N. Rashevsky and H. D. Landahl, for which I 
express my gratitude. 


LITERATURE 


Bjerknes, V., J. Bjerknes, H. Solberg, and T. Bergeron. 1933.
Physikalische Hydrodynamik. Berlin: J. Springer.

Hotelling, H. 1949. ‘‘Stochastic Processes, Historical Summary of the
Problem.’’ (abstract) Econometrica, 17, 66-8.

Kellogg, O. D. 1929. Foundations of Potential Theory. New York: F. Ungar.

Rashevsky, N. 1947. Mathematical Theory of Human Relations.
Bloomington, Indiana: Principia Press.

Simon, H. 1955. ‘‘On a Class of Skew Distribution Functions.’’
Biometrika, 42, Parts 3 and 4, 425-40.

Skellam, J. G. 1951. ‘‘Random Dispersal in Theoretical Populations.’’
Biometrika, 38, 196-217.

Spengler, J. J. 1952. ‘‘Population Theory.’’ A Survey of Contemporary
Economics, Vol. 2, pp. 83-131. (B. F. Haley, Ed.) Homewood, Ill.:
R. D. Irwin.

Stewart, J. Q. 1947. ‘‘Empirical Mathematical Rules Concerning the
Distribution and Equilibrium of Population.’’ Geographical Review, 37,
46-85.

Zipf, G. K. 1941. National Unity and Disunity. Bloomington, Indiana:
Principia Press.


RECEIVED 12-19-56 


CONTRIBUTIONS TO THE THEORY OF IMITATIVE BEHAVIOR 


N. RASHEVSKY 
COMMITTEE ON MATHEMATICAL BIOLOGY 
THE UNIVERSITY OF CHICAGO 


The imitation effects in a social group depend both on the size of the
group and on the distribution of a certain psychobiological quantity $\phi$
which measures the tendency of an individual towards a given behavior.
The distribution function of $\phi$ determines the ratio $\mu$ of the individuals in
the society who adopt a given behavior. When the size of the social group
is not too large, the actual distribution of $\phi$ will deviate from the most
probable one, and therefore communities of the same size and having the
same parameters may have different values of $\mu$. Approximate equations
are developed which give the probability of a given $\mu$ for a group of a
given size. Possible effects of interactions of communities of different
sizes are briefly discussed. A generalization of the theory of imitative
behavior to any number of mutually exclusive behaviors is given, and its
possible sociological implications are discussed.


I 


In a number of previous publications (Rashevsky, 1949a,b,c,
1950b, 1951a,b, 1953), we have studied the effects of imitation on
the behavior of large groups of individuals. With the exception of a
brief outline of possible generalizations (Rashevsky, 1951a;
hereinafter referred to as MBSB; chap. xiii), our studies were confined to
a rather oversimplified special case in which the individuals of the
social group have the choice of only two mutually exclusive
behaviors, which for convenience we shall denote by $R_1$ and $R_2$.
Different individuals have different degrees of preference toward $R_1$
or $R_2$, and this determines their behavior. The difference in
preference towards $R_1$ or $R_2$ has been expressed in terms of a quantity
$\phi$ which measures the difference in the excitation of two central
mechanisms, not described more closely, one of which is responsible
for behavior $R_1$, the other for $R_2$. When the two excitations, $\varepsilon_1$ and
$\varepsilon_2$, are equal, then $\phi = \varepsilon_1 - \varepsilon_2 = 0$, and the individual does not have
any preference for either $R_1$ or $R_2$. He will either abstain from
both activities or perform them equally frequently. If $\phi = \varepsilon_1 - \varepsilon_2 > 0$,
then the individual prefers the activity $R_1$. He will perform $R_1$
more frequently than $R_2$. The relative frequency of $R_1$, as
compared with $R_2$, increases with $\phi$, and the relation between this
relative frequency and $\phi$ is given by H. D. Landahl’s (1938) theory
of psychophysical judgment. When $\phi = \varepsilon_1 - \varepsilon_2 < 0$, then the
individual prefers activity $R_2$, and the relative frequency of $R_2$
increases as $\phi$ decreases. For $\phi = +\infty$ only $R_1$ is performed; for
$\phi = -\infty$, only $R_2$. Up to a scale factor, $\phi$ may be determined by
psychophysical methods, from the observed relative frequencies of
$R_1$ and $R_2$, using Landahl’s equations (MBSB, chapters ii and iii).

The above describes the behavior of an individual in a society if
this individual is not influenced by the behavior of others. Under
those conditions, the distribution function $N(\phi)$ of $\phi$ in the society
determines the fraction of individuals which at any given time
perform behavior $R_1$ or correspondingly $R_2$. If $N(\phi)$ is symmetrical
with respect to $\phi = 0$, so that $N(\phi) = N(-\phi)$, but otherwise
arbitrary, then the number of individuals who at any given time perform
activity $R_1$ is equal to the number of individuals who perform
activity $R_2$. There is, however, a constant shifting of behavior
between individuals so that we have here a dynamic equilibrium.

If, however, the individuals in a society imitate each other, the
situation is changed. The sight by a given individual of another
individual who performs activity $R_1$ increases the preference of
the given individual towards $R_1$; in other words, it increases $\phi$. On
the other hand, the sight of another individual who performs $R_2$
decreases the value of $\phi$ of the given individual. Under those
conditions, even when $N(\phi)$ is symmetric with respect to $\phi = 0$, a
slight accidental shift of the ratio of individuals performing $R_1$
and $R_2$ may result in a further shift of that ratio, so that the
number of individuals performing, let us say, activity $R_1$ increases at
the expense of those performing $R_2$. Eventually a stable situation
is reached in which the great majority of individuals perform
either $R_1$ or $R_2$. If $N(\phi)$ is symmetric, then the outcome as to
whether the majority will perform $R_1$ or $R_2$ is determined by pure
chance. Thus as a result of imitation, a society in which the
average ‘‘natural’’ preference towards either $R_1$ or $R_2$ is zero may
perform preponderantly either $R_1$ or $R_2$, as the case may be. This
may be considered as one of the characteristics of mass or ‘‘mob’’
behavior.




Another important limitation in previous studies was that we 
considered a uniform spatial distribution of individuals. We did 
introduce the effect of distance between two individuals upon their 
influence on each other (Rashevsky, 1950b), but the density of 
population was always considered as constant. This, however, 
never happens in any society, even approximately. The individuals 
in any society are distributed in cities, towns, villages, and other 
communities of different sizes. We may expect this to have very 
pronounced effects on imitative behavior. The purpose of this 
paper is to investigate some of those effects and then to point out 
the possible relations between our conclusions and some actual 
Situations. 

The ratio of numbers of individuals which perform the activities
$R_1$ and $R_2$ is determined by the shape of the distribution function
$N(\phi)$ as well as by some other biological parameters. In our
present study we shall limit ourselves to the case in which those other
parameters are constant characteristics of the individuals and are,
therefore, not affected by non-uniformities of spatial distribution.
We shall investigate the possible effects of this non-uniformity on
the shape of the function $N(\phi)$.

If $N_0$, which we suppose to be very large, is the total number of
individuals in the society, then

$$N_0\, N(\phi)\, d\phi \tag{1}$$

is the number of individuals whose $\phi$ lies between $\phi$ and $\phi + d\phi$.
Or we may say that $N(\phi)\,d\phi$ is the probability that any individual
picked at random has a $\phi$ between $\phi$ and $\phi + d\phi$. Then, if $N_0$ is
the number of ‘‘trials,’’ the actual number of individuals for very
large values of $N_0$ will be given by (1).

Let now the whole population be distributed in communities of
different sizes, and let us at first assume that those communities
are independent of each other.

Inasmuch as $\phi$ is a purely biological parameter, the probability
$N(\phi)\,d\phi$ is also constant in a given society and is independent of
the spatial arrangement of the individuals. Consider, however, a
community with a population $N_0'$, which is appreciably smaller than
$N_0$. If $N_0'$ is not too large, then the number of individuals with a $\phi$
between $\phi$ and $\phi + d\phi$ (or as we shall say for brevity ‘‘with a given
$\phi$’’) will fluctuate around the value $N_0' N(\phi)\,d\phi$, those fluctuations
being the larger, the smaller $N_0'$. Therefore the actual number of
individuals with a given $\phi$ will be given for not too large values of
$N_0'$ not by $N_0' N(\phi)\,d\phi$ but by

$$N_0'\, u(\phi)\, d\phi, \tag{2}$$

where $u(\phi)$ is a function which is slightly different from $N(\phi)$.

In other words, for very large values of $N_0'$, the actual
distribution of $\phi$ coincides with the most probable one; for smaller values
of $N_0'$, the actual distribution is different from the most probable
one. It is the actual distribution that determines the ratio $\mu$ of the
number of individuals who perform behavior $R_1$ to the number of
those who perform $R_2$. Therefore that ratio $\mu$ may be different in
different small communities of equal size.
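
The size dependence of these fluctuations can be illustrated by simulation. The sketch below is an illustration under assumptions, not the author's calculation: $\phi$ is drawn from a standard normal distribution standing in for $N(\phi)$, imitation is ignored, and the share of individuals with $\phi > 0$ is used as a simple stand-in for the behavior ratio. The function names and the seed are hypothetical.

```python
import random
import statistics


def share_positive(n, rng):
    """Share of n individuals whose phi, drawn from a standard normal, is positive."""
    return sum(1 for _ in range(n) if rng.gauss(0.0, 1.0) > 0) / n


def spread_across_communities(n, communities=100, seed=0):
    """Standard deviation of the positive-phi share across equally sized communities."""
    rng = random.Random(seed)
    shares = [share_positive(n, rng) for _ in range(communities)]
    return statistics.pstdev(shares)


if __name__ == "__main__":
    for size in (50, 500, 2000):
        print(size, spread_across_communities(size))
```

Communities of the same size draw from the same distribution, yet their realized shares differ, and the spread shrinks roughly as the inverse square root of the community size.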

The value of $\mu$ is a function of $N_0'$ (MBSB, chapters xii and xiii).
Hence in the absence of fluctuations the distribution function of $\mu$
in a society which consists of a large number of communities of
different sizes would be determined by the distribution function of
the community sizes in a very simple manner. The result of the
fluctuations, however, is that sufficiently small communities of the
same size will have in general different values of $\mu$. Therefore the
relation between the distribution function of community sizes and
the distribution function of $\mu$ becomes more complicated.

However, the deviation from the most probable value $N_0' N(\phi)\,d\phi$
is not fixed for a given $N_0'$, but itself fluctuates according to the
laws of probability. The problem arises to determine for a given $N_0'$
the probability that the actual distribution will lie between the
limits $u(\phi)\,d\phi$ and $u(\phi)\,d\phi + \delta u(\phi)\,d\phi$, if the most probable
distribution is given by $N(\phi)\,d\phi$. This leads to a rather interesting problem
in functional analysis. Since we have not found the solution of
that problem, we shall only formulate it and outline a tentative
approach. After that we shall use an approximate method which will
give us the desired results.


II 


If n is the number of trials and if p is the probability of a given
event to occur during any one of those trials, if $q = 1 - p$ and if
$t_1$ and $t_2 > t_1$ are two numbers, then, according to Laplace’s
theorem, the probability that the actual number m of occurrences of the
event in the n trials lies between

$$pn + t_1 \sqrt{2npq} \quad \text{and} \quad pn + t_2 \sqrt{2npq} \tag{3}$$

approaches

$$\frac{1}{\sqrt{\pi}} \int_{t_1}^{t_2} e^{-x^2}\, dx. \tag{4}$$

If p is very small, then q is very close to 1. Let $t_1 = t$; $t_2 = t + dt$,
$dt > 0$. Put

$$t \sqrt{2npq} = \Delta; \tag{5}$$

then

$$(t + dt) \sqrt{2npq} = \Delta + d\Delta. \tag{6}$$

Putting $q = 1$, we see that the probability of m being between

$$pn + \Delta \quad \text{and} \quad pn + (\Delta + d\Delta) \tag{7}$$

tends with increasing n to

$$\frac{1}{\sqrt{\pi}}\, e^{-t^2}\, dt. \tag{8}$$

But from (5) we have

$$t = \frac{\Delta}{\sqrt{2pn}}; \qquad dt = \frac{d\Delta}{\sqrt{2pn}}. \tag{9}$$
Hence, introducing this into (4), we find that the probability of m 
being between pn + A and pn + (A + dA) approaches the value 


Le 


P(A)dA = —1__¢7 7p dA (10) 
nnp 


as n increases. But P(A)dA is nothing else but the probability that 
the actual number m of occurrences will deviate from the most prob- 
able number np by an amount between A and A + dA. 

Though $P(\Delta)\,d\Delta$ is a limiting probability, yet expression (10) may be used as an approximation for the actual probability that the number $m$ of occurrences deviates from the most probable average number $np$ by an amount between $\Delta$ and $\Delta + d\Delta$.
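As a rough numerical illustration of how expression (10) behaves, the following sketch compares it with the exact binomial probability. The values of `n` and `p` are arbitrary choices made here for illustration and do not come from the paper; the function names are likewise hypothetical.

```python
import math

def binomial_pmf(n, p, m):
    """Exact probability of m occurrences in n trials, computed in log space."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)
               + m * math.log(p) + (n - m) * math.log1p(-p))
    return math.exp(log_pmf)

def gaussian_10(n, p, delta):
    """Expression (10): density of the deviation delta from np, with q set to 1."""
    return math.exp(-delta ** 2 / (2 * n * p)) / math.sqrt(2 * math.pi * n * p)

# Illustrative values only: np = 100, with p small enough that q is close to 1.
n, p = 10_000, 0.01
for delta in (-20, -10, 0, 10, 20):
    exact = binomial_pmf(n, p, round(n * p) + delta)
    approx = gaussian_10(n, p, delta)
    print(f"delta={delta:+d}  exact={exact:.5f}  expression (10)={approx:.5f}")
```

For these values the two columns should agree to within a few per cent near the centre; the small residual asymmetry comes from the skewness of the binomial law, which (10) ignores.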

Expression (10) is applicable as an approximation when

$$np \gg 1. \tag{11}$$

For values of $np$ which are of the order of 1, Poisson's expression must be used, which holds for the case that with increasing $n$ the value of $p$ decreases so that

$$np = a = \text{const}. \tag{12}$$

In that case the probability $P(m)$ that in $n$ trials the event occurs exactly $m$ times is given approximately by

$$P(m) = \frac{a^m e^{-a}}{m!}. \tag{13}$$

In all the applications beginning with section III we shall ex- 
plicitly limit ourselves to the case where inequality (11) holds and 
shall therefore use equation (10) throughout. 
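The reason for restricting attention to inequality (11) can be seen in a small numerical sketch. It takes $np = 1$ (an illustrative choice, not a value from the paper) and compares the exact binomial probability with Poisson's expression (13) and with expression (10) evaluated at $\Delta = m - np$.

```python
import math

def binomial_pmf(n, p, m):
    """Exact probability of m occurrences in n trials."""
    return math.comb(n, m) * p ** m * (1 - p) ** (n - m)

def poisson_13(a, m):
    """Poisson's expression (13) with a = np."""
    return a ** m * math.exp(-a) / math.factorial(m)

def gaussian_10(n, p, delta):
    """Expression (10), used here outside its range of validity on purpose."""
    return math.exp(-delta ** 2 / (2 * n * p)) / math.sqrt(2 * math.pi * n * p)

n, p = 1000, 0.001  # illustrative values giving np = 1
for m in range(5):
    print(m,
          round(binomial_pmf(n, p, m), 4),
          round(poisson_13(n * p, m), 4),
          round(gaussian_10(n, p, m - n * p), 4))
```

With these values the Poisson column should track the exact one closely, while expression (10) is visibly off, most of all at $m = 0$.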

What we need, however, for an exact treatment of the problem is the probability $P$ of a distribution lying between $u(\phi)$ and $u(\phi) + du(\phi)$, for a sample of size $N_0$, if this distribution function for very large, theoretically infinite, populations is given by $N(\phi)$. One way of approaching the problem would be to subdivide the whole range of $\phi$ into finite intervals and apply to each interval expression (10). In this case, however, the transition to a continuous distribution presents certain difficulties.

It may seem natural to think of expressing the desired probabilities in terms of the distance

$$\int_{-\infty}^{+\infty} [N(\phi) - u(\phi)]^2\,d\phi \tag{14}$$

between the functions $N(\phi)$ and $u(\phi)$ in the functional space. Probability distributions of the distances of the cumulative function have been derived. But those expressions give us only the probability of a class of functions, characterized by a given distance from $N(\phi)$.

Suppose, however, that we have a solution of our problem. As we said above, it is $u(\phi)$ which determines the ratio $\mu$ of the numbers of individuals who perform activities $R_1$ and $R_2$. But while a given $u(\phi)$ determines $\mu$ uniquely, the inverse does not hold. There is an infinite number of functions $u(\phi)$ which give the same $\mu$. This is seen from equation (20) of chapter xii of MBSB. This expression gives the difference $X - Y$ of the numbers of individuals which perform activities $R_1$ and $R_2$. Inasmuch as, in chapter xii of MBSB, $N(\phi)\,d\phi$ denoted the number of individuals with a given $\phi$, which here is given by $N_0 u(\phi)\,d\phi$, therefore in our present notation we have

$$X - Y = N_0 \int_{-\infty}^{+\infty} u(\phi)\,[2P_1(\phi,\psi) - 1]\,d\phi, \tag{15}$$

where the notations are the same as in MBSB. Denote the integral in (15) by $F(\psi)$. Then, since

$$X + Y = N_0, \tag{16}$$

we find:

$$X = \frac{N_0}{2}\,(1 + F(\psi)); \qquad Y = \frac{N_0}{2}\,(1 - F(\psi)). \tag{17}$$

Hence

$$\mu = \frac{X}{Y} = \frac{1 + F(\psi)}{1 - F(\psi)}. \tag{18}$$

As we have seen in MBSB, chapter xii, $F(\psi)$ is an S-shaped curve with two asymptotes, $F(-\infty) = -1$, $F(+\infty) = +1$. For symmetric $N(\phi)$, $F(0) = 0$.

The value of $\mu$ in the stable state is given by the value of $X/Y$ which corresponds to a particular value $\psi^*$ of $\psi$. This value $\psi^*$ is the root of the equation

$$N_0 F(\psi) - a\psi = 0. \tag{19}$$

The value $\psi^*$ may be expressed in terms of $N_0$ and $\mu$ in the following manner. When (19) is satisfied, then $a\psi$ is equal to $X^* - Y^*$, where $X^*$ and $Y^*$ are the stable equilibrium values of $X$ and $Y$. Hence

$$a\psi^* = X^* - Y^*. \tag{20}$$

Together with (16) this gives:

$$X^* = \frac{N_0 + a\psi^*}{2}; \qquad Y^* = \frac{N_0 - a\psi^*}{2} \tag{21}$$

or

$$\mu = \frac{X^*}{Y^*} = \frac{N_0 + a\psi^*}{N_0 - a\psi^*}. \tag{22}$$

Hence, for a given $\mu$ we have a definite $\psi^*$ given by (22). According to (18), with this value of $\psi^*$, the ratio $\mu$ is also given by

$$\mu = \frac{1 + F(\psi^*)}{1 - F(\psi^*)}. \tag{23}$$
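The agreement between (22) and (23) at the root of (19) can be checked numerically. The sketch below assumes, purely for illustration, a specific S-shaped curve $F(\psi) = 2\Phi(\psi/\sqrt{1+\sigma^2}) - 1$, where $\Phi$ is the normal cumulative function; this form, the bisection routine, and the values of $N_0$ and $a$ are assumptions of the example and are not taken from MBSB.

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def F(psi, sigma=1.0):
    """Assumed S-shaped curve with F(-inf) = -1, F(0) = 0, F(+inf) = +1."""
    return 2.0 * normal_cdf(psi / math.sqrt(1.0 + sigma * sigma)) - 1.0

def stable_psi(n0, a, sigma=1.0, tol=1e-12):
    """Positive root of N0*F(psi) - a*psi = 0, equation (19), by bisection."""
    def g(psi):
        return n0 * F(psi, sigma) - a * psi
    lo, hi = 1e-6, n0 / a        # g(hi) < 0 because F stays below 1
    if g(lo) <= 0:
        raise ValueError("no positive root for these parameters")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n0, a = 1000.0, 500.0            # illustrative values only
psi_star = stable_psi(n0, a)
mu_22 = (n0 + a * psi_star) / (n0 - a * psi_star)       # expression (22)
mu_23 = (1 + F(psi_star)) / (1 - F(psi_star))           # expression (23)
print(psi_star, mu_22, mu_23)
```

Since $a\psi^* = N_0 F(\psi^*)$ at the root, the two printed values of $\mu$ should coincide up to the bisection tolerance.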
Hence, for a given $\mu$, the integral

$$F(\psi^*) = \int_{-\infty}^{+\infty} u(\phi)\,[2P_1(\phi,\psi^*) - 1]\,d\phi \tag{24}$$

must have a fixed value. This value, as seen from (23), is equal to $(\mu - 1)/(\mu + 1)$. But since

$$\int_{-\infty}^{+\infty} u(\phi)\,d\phi = 1, \tag{25}$$

therefore we must have

$$\int_{-\infty}^{+\infty} 2u(\phi)\,P_1(\phi,\psi^*)\,d\phi = \frac{2\mu}{\mu + 1}. \tag{26}$$

In equation (26) $P_1(\phi,\psi^*)$ is a given function, determined by Landahl's theory of psychophysical discrimination. Any function $u(\phi)$ which satisfies (26) will give the same value of $\mu$. There is, in general, an infinite number of such functions.
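That quite different functions $u(\phi)$ can satisfy (26) with the same $\mu$ may be illustrated numerically. The sketch below assumes, only for the sake of the example, that $P_1(\phi,\psi)$ is the normal cumulative function of $\phi + \psi$ and that $u(\phi)$ is a normal density; neither assumption is Landahl's actual function. Two densities of different width, the wider one suitably shifted, are then integrated as in (24).

```python
import math

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def F(psi, mean, sd, steps=6000, width=12.0):
    """Trapezoid rule for the integral (24), with the assumed u and P_1."""
    lo = mean - width * sd
    h = 2.0 * width * sd / steps
    total = 0.0
    for k in range(steps + 1):
        phi = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * normal_pdf(phi, mean, sd) * (2.0 * normal_cdf(phi + psi) - 1.0)
    return total * h

def mu_from_F(f):
    """Expression (23)."""
    return (1.0 + f) / (1.0 - f)

psi_star = 1.0                              # illustrative value
shift = math.sqrt(5.0 / 2.0) - 1.0          # chosen so that both integrals agree
f_narrow = F(psi_star, mean=0.0, sd=1.0)    # u with standard deviation 1
f_wide = F(psi_star, mean=shift, sd=2.0)    # a different u, standard deviation 2
print(mu_from_F(f_narrow), mu_from_F(f_wide))
```

Under these assumptions the integral has the closed form $2\Phi\big((\psi + \bar\phi)/\sqrt{1 + \sigma^2}\big) - 1$, with $\bar\phi$ the mean of $u$, which is how the shift was chosen; the two printed ratios should therefore be equal.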

Suppose that we know the expression for the probability that the actual distribution function of $\phi$ lies between $u(\phi)$ and $u(\phi) + du(\phi)$. This probability is a functional of $u(\phi)$. To obtain the probability that in a population of $N_0$ individuals the ratio $\mu$ lies between $\mu$ and $\mu + d\mu$, we must integrate the above functional over all the forms of $u(\phi)$ which satisfy (26).


III


Pending the solution of the above outlined problem, we shall approach the whole situation from a different angle. In MBSB, chapter xiii, we pointed out that the theory of imitative behavior based on a continuous distribution function $N(\phi)$ may be simplified by considering in the first approximation a discontinuous case of three categories of individuals. The majority of all individuals may be considered as having no preference for either $R_1$ or $R_2$. We shall call them the passive or indifferent individuals. Their number is denoted by $N'$. A small number $X_0$ of individuals has such a strong preference for $R_1$ that it performs only that activity. Another relatively small number $Y_0$ of individuals performs only $R_2$. We thus have

$$N_0 = N' + X_0 + Y_0. \tag{27}$$

Of the $N'$ passive individuals a number $X$ will perform activity $R_1$, and a number $Y$, activity $R_2$. We then ask how $X$ and $Y$, and therefore the ratio

$$\mu = \frac{X}{Y}, \tag{28}$$

are determined by the quantities $X_0$, $Y_0$, and $N_0$. Those three quantities very roughly determine a sort of step-shaped distribution. As was pointed out in chapter xiii of MBSB, the correspondence between the function $N(\phi)$ and the distribution of $X_0$, $Y_0$, and $N_0$ is approximately established by the relations:
=% ey Ae Sabet a 
Ny Ny 
where $\sigma$ is the standard deviation of $N(\phi)$.

As has been shown in MBSB, chapter xiii, the ratio $\mu$ is given by

$$\mu = \frac{a_0 X_0 - c_0 Y_0 - (a - c_0\epsilon' Y_0)N'}{c_0 Y_0 - a_0 X_0 - (a - a_0\epsilon X_0)N'}. \tag{29}$$

The meaning of the constants $a_0$, $c_0$, $a$, $\epsilon$, and $\epsilon'$ has been explained in MBSB.

The ratio $\mu$ changes continuously with $X_0$ and $Y_0$, up to a certain range. For

$$X_0 = \frac{c_0 Y_0 - aN'}{a_0(1 - \epsilon N')} \tag{30}$$

the denominator of (29) becomes zero, and $\mu = \infty$. This means that $Y = 0$ and that all passive individuals perform behavior $R_1$. If $X_0$ is greater than the right side of (30), then $\mu < 0$, which is physically impossible. The meaning of this is that for values of $X_0$ which are equal to or greater than the value given by (30), $\mu = \infty$, or $Y = 0$, $X = N'$. Hence

$$\mu = \infty \quad\text{when}\quad X_0 \ge \frac{c_0 Y_0 - aN'}{a_0(1 - \epsilon N')}. \tag{31}$$

Similarly by equating the numerator of (29) to zero we find that

$$\mu = 0 \quad\text{when}\quad X_0 \le \frac{c_0(1 - \epsilon' N')\,Y_0 + aN'}{a_0}. \tag{32}$$

In this case $X = 0$; $Y = N'$. All passive individuals perform behavior $R_2$.

Just as we considered $N(\phi)$ to be a biological characteristic of a given society, so shall we consider the ratios $X_0/N_0$ and $Y_0/N_0$ to be a biological characteristic of a given society. Let $\alpha$ be the biologically determined probability that a randomly chosen individual is active and prefers $R_1$; let $\beta$ be the probability that a randomly chosen individual is active and prefers $R_2$. Then for very large values of $N_0$ we shall have:

$$X_0 = \alpha N_0; \qquad Y_0 = \beta N_0. \tag{33}$$

If, however, $N_0$ is not too large, then the numbers of actives of type $R_1$ and of type $R_2$ will deviate from the most probable values (33). Therefore the ratio $\mu = X/Y$ will fluctuate, and different communities of the same size will have different values of $\mu$, even though the parameters $a_0$, $c_0$, $a$, $\epsilon$, and $\epsilon'$ are constant for all of them. The constancy of those parameters is explicitly assumed in this paper.
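How quickly these fluctuations die out with community size can be gauged from the binomial law itself. The sketch below, with an illustrative $\alpha = 0.01$ chosen for this example only, prints the standard deviation of $X_0$ relative to its most probable value $\alpha N_0$.

```python
import math

def relative_fluctuation(n0, alpha):
    """Standard deviation of X0 divided by alpha*N0, for binomial sampling."""
    return math.sqrt(n0 * alpha * (1 - alpha)) / (alpha * n0)

for n0 in (100, 1_000, 10_000, 1_000_000):
    print(n0, round(relative_fluctuation(n0, 0.01), 4))
```

The relative spread falls off as $1/\sqrt{\alpha N_0}$, so a community of a hundred individuals fluctuates by roughly its whole expected number of actives, while one of a million fluctuates by about one per cent.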

We shall first present a general method of treating the problem in which we are interested and then illustrate it by applying it to the case of expression (29).

Instead of considering simple linear differential equations for $X$ and $Y$, as has been done in MBSB, chapter xiii, we may make more general and complex assumptions. Instead of (29) we then shall obtain some more complicated expressions. In general we shall have

$$\mu = F(X_0, Y_0, N_0), \tag{34}$$

where $F(X_0, Y_0, N_0)$ is a function of the three variables and of the constant parameters $a_0$, $c_0$, $a$, $\epsilon$, and $\epsilon'$.

For sufficiently large values of $N_0$, we may substitute (33) into (34), and obtain:

$$\mu = F(\alpha N_0, \beta N_0, N_0). \tag{35}$$

Thus in this case $\mu$ is completely determined by $N_0$, $\alpha$, and $\beta$. Those last two quantities thus give us the characteristics of the simple society considered here.

When, however, $N_0$ is not too large, then the values of $X_0$ and $Y_0$ may deviate from those given by (33). We shall have in general

$$X_0 = \alpha N_0 + \Delta_1; \qquad Y_0 = \beta N_0 + \Delta_2. \tag{36}$$

The probability $P_1(\Delta_1)\,d\Delta_1$ that for a given $N_0$ the deviation of $X_0$ from $\alpha N_0$ lies between $\Delta_1$ and $\Delta_1 + d\Delta_1$ is given by expression (10), in which we must put $n = N_0$; $p = \alpha$. We thus find:

$$P_1(\Delta_1)\,d\Delta_1 = \frac{1}{\sqrt{2\pi\alpha N_0}}\, e^{-\frac{\Delta_1^2}{2\alpha N_0}}\, d\Delta_1. \tag{37}$$

Similarly, we find for the probability $P_2(\Delta_2)\,d\Delta_2$ that $Y_0$ deviates from $\beta N_0$ by an amount which lies between $\Delta_2$ and $\Delta_2 + d\Delta_2$:

$$P_2(\Delta_2)\,d\Delta_2 = \frac{1}{\sqrt{2\pi\beta N_0}}\, e^{-\frac{\Delta_2^2}{2\beta N_0}}\, d\Delta_2. \tag{38}$$

As has been remarked above, the approximate validity of (10) is contingent upon inequality (11), which in the present case means that $\alpha N_0 \gg 1$. This assumption is rather plausible. If we consider, for example, that $\alpha = 0.01$, that is, one per cent of the whole population is active and prefers $R_1$, then for a community with $N_0 = 1000$, we have already $\alpha N_0 = 10$. If $\alpha$ and $\beta$ are both equal to a few hundredths, the condition of approximate validity of (10) is definitely justified.

When $X_0$ and $Y_0$ have the values given by (36), then, introducing them into (34) we find:

$$\mu = F(\alpha N_0 + \Delta_1;\ \beta N_0 + \Delta_2;\ N_0). \tag{39}$$


For a fixed value of $\mu$ equation (39) gives a relation between $\Delta_1$ and $\Delta_2$, which must be satisfied in order that $\mu$ remains constant in spite of the fluctuations of $X_0$ and $Y_0$.

Solving (39) with respect to $\Delta_2$, we obtain an expression of the form:

$$\Delta_2 = \Omega(\alpha N_0 + \Delta_1;\ \beta N_0;\ N_0). \tag{40}$$

From (40) we obtain:

$$d\Delta_2 = \frac{\partial\Omega}{\partial\Delta_1}\, d\Delta_1, \tag{41}$$

or, putting

$$\frac{\partial\Omega}{\partial\Delta_1} = A(\alpha N_0 + \Delta_1;\ \beta N_0;\ N_0), \tag{42}$$

$$d\Delta_2 = A(\alpha N_0 + \Delta_1;\ \beta N_0;\ N_0)\, d\Delta_1. \tag{43}$$

Equation (40) determines a curve in the $(\Delta_1, \Delta_2)$ plane. At all points of this curve $\mu$ is constant.

The probability that the deviation of $X_0$ from $\alpha N_0$ lies between $\Delta_1$ and $\Delta_1 + d\Delta_1$ and that at the same time the deviation of $Y_0$ from $\beta N_0$ lies between $\Delta_2$ and $\Delta_2 + d\Delta_2$ is given by the product of expressions (37) and (38), that is, by $P_1(\Delta_1)P_2(\Delta_2)\,d\Delta_1 d\Delta_2$. In order to find the probability $P(\mu)\,d\mu$ that $\mu$ lies between $\mu$ and $\mu + d\mu$, we must integrate $P_1(\Delta_1)P_2(\Delta_2)\,d\Delta_1 d\Delta_2$ over all the values of $\Delta_1$ and $\Delta_2$ which lie within the strip contained between the curve $\mu = \text{const}$ and $\mu + d\mu = \text{const}$. To do this we change our variables from $\Delta_1$ and $\Delta_2$ to $s$ and $\mu$, where $s$ is the distance along the curve $\mu = \text{const}$. We have (Figure 1)

$$ds = \sqrt{(d\Delta_1)^2 + (d\Delta_2)^2}, \tag{44}$$

or, using (43) and omitting the designation of arguments in $A$:

$$ds = \sqrt{1 + A^2}\; d\Delta_1. \tag{45}$$
FIGURE 1.


Further we have from Figure 1: 


et SL. Seay ETE (48) 


$$d\Delta_2 = d\mu\,\sqrt{1 + A^2}. \tag{47}$$

Hence

$$d\Delta_1\, d\Delta_2 = ds\, d\mu. \tag{48}$$

From (45) we also have, by integrating:

$$\Delta_1 = f(s) + \text{const}. \tag{49}$$

Without any loss of generality we may put the constant equal to zero, counting the distances along the curve from the point of its intersection with the $\Delta_2$ axis. In the particular case where $A$ does not depend on $\Delta_1$, we find:

$$\Delta_1 = \frac{s}{\sqrt{1 + A^2}}. \tag{50}$$

Substituting for $\Delta_2$ the expression (40), then for $\Delta_1$ the expression (50), and for $d\Delta_1 d\Delta_2$ the expression (48), we find:


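$$P_1(\Delta_1)P_2(\Delta_2)\,d\Delta_1 d\Delta_2 = P_1\!\left(\frac{s}{\sqrt{1 + A^2}}\right) P_2\!\left[\Omega\!\left(\alpha N_0 + \frac{s}{\sqrt{1 + A^2}};\ \beta N_0;\ N_0\right)\right] d\mu\, ds. \tag{51}$$

Hence, from what was said above, the probability $P(\mu)\,d\mu$ that $\mu$ lies between $\mu$ and $\mu + d\mu$ is, because of (37) and (38):

$$P(\mu)\,d\mu = \frac{d\mu}{2\pi N_0\sqrt{\alpha\beta}} \int_{-\infty}^{+\infty} \exp\left[-\frac{s^2}{2\alpha N_0(1 + A^2)} - \frac{\Omega^2\!\left(\alpha N_0 + \frac{s}{\sqrt{1 + A^2}};\ \beta N_0;\ N_0\right)}{2\beta N_0}\right] ds. \tag{52}$$

This expression gives us the distribution function of $\mu$ for a set of communities of size $N_0$.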
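The kind of distribution described here can also be explored by direct sampling. The sketch below draws $\Delta_1$ and $\Delta_2$ from the Gaussian laws (37) and (38) and evaluates $\mu$ for each draw. For $F$ it uses, purely as an illustration, the simple linear-fractional form that the paper reaches later in (85); the parameter values, the seed, and the function names are assumptions of this example.

```python
import random
import statistics

def mu_linear(x0, y0, n0, a1):
    """Illustrative choice of F(X0, Y0, N0): the special form of (85)."""
    return (x0 - a1 * n0) / (y0 - a1 * n0)

def sample_mu(n0, alpha, beta, a1, draws, seed=0):
    """Sample mu for communities of size n0 using the laws (37) and (38)."""
    rng = random.Random(seed)
    values = []
    for _ in range(draws):
        d1 = rng.gauss(0.0, (alpha * n0) ** 0.5)   # deviation of X0, law (37)
        d2 = rng.gauss(0.0, (beta * n0) ** 0.5)    # deviation of Y0, law (38)
        values.append(mu_linear(alpha * n0 + d1, beta * n0 + d2, n0, a1))
    return values

alpha, beta, a1 = 0.03, 0.02, 0.005                # illustrative values only
small = sample_mu(10_000, alpha, beta, a1, draws=20_000)
large = sample_mu(1_000_000, alpha, beta, a1, draws=20_000)
print(statistics.median(small), statistics.stdev(small))
print(statistics.median(large), statistics.stdev(large))
```

For these values the medians should sit near $(\alpha - a_1)/(\beta - a_1) = 5/3$ for both sizes, while the spread of $\mu$ is markedly wider for the smaller communities.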
It must be noted that in general

$$\int_0^\infty P(\mu)\,d\mu < 1,$$

because for some choices of the function $F$ in (34) $\mu$ becomes infinite when $X_0$ is equal to or greater than a certain value, while $\mu$ becomes zero when $X_0$ is equal to or less than another certain value. [See expressions (31) and (32).] This may not be the case for some other choices of the function $F$ in (34).

If, however, we use expression (29) which implies (31) and (32), then strictly speaking the limits of integration in (52) should not be $-\infty$ and $+\infty$. The limits are actually determined by the condition that the values of $X_0$ and $Y_0$, as given by (36), should not satisfy inequalities (31) and (32). Or, to put it differently, those values should satisfy the inverse of inequalities (31) and (32). This holds for all the following expressions in this paper which are based on expression (29).

Practically, however, we do not make in general too great an error in using $-\infty$ and $+\infty$ as limits of integration because for sufficiently large values of $s$ the integrand in (52), as well as in other similar expressions, is very small and contributes insignificantly to the value of the integral. The only exception would be the case of a very large standard deviation of the integrand or the case where $\alpha$ and/or $\beta$ are chosen so that the values of $X_0$ and $Y_0$ given by (36) are very near to those which satisfy the inequality condition in (31) and (32). Expression (52) has a maximum for

$$\mu_{\max} = F(\alpha N_0;\ \beta N_0;\ N_0). \tag{53}$$


In general, for any function $F$ we shall find relations similar to (31) and (32). In other words we shall find that

$$\mu = \infty \quad\text{if}\quad X_0 \ge V(Y_0, N_0), \tag{54}$$

and

$$\mu = 0 \quad\text{if}\quad X_0 \le W(Y_0, N_0). \tag{55}$$

Introducing (36) into (54) we find:

$$\Delta_1 \ge V(\beta N_0 + \Delta_2;\ N_0) - \alpha N_0. \tag{56}$$

The probability of inequality (56) for a given $\Delta_2$ is given by

$$\frac{1}{\sqrt{2\pi\alpha N_0}} \int_{V(\beta N_0 + \Delta_2;\, N_0) - \alpha N_0}^{+\infty} e^{-\frac{\Delta_1^2}{2\alpha N_0}}\, d\Delta_1. \tag{57}$$

The probability of $\Delta_2$ being between $\Delta_2$ and $\Delta_2 + d\Delta_2$ is given by expression (38). Hence the probability of inequality (54) for any possible $\Delta_2$ is given by

$$P_\infty = \frac{1}{2\pi N_0\sqrt{\alpha\beta}} \int_{-\infty}^{+\infty} d\Delta_2\; e^{-\frac{\Delta_2^2}{2\beta N_0}} \int_{V(\beta N_0 + \Delta_2;\, N_0) - \alpha N_0}^{+\infty} e^{-\frac{\Delta_1^2}{2\alpha N_0}}\, d\Delta_1. \tag{58}$$

$P_\infty$ is a function of $N_0$. Expression (58) gives us the probability that amongst communities of size $N_0$ such communities will be found in which $\mu = \infty$. Those are such communities which exhibit a "solid" behavior $R_1$. That is, with the exception of the small number $Y_0$ of actives of type $R_2$, all individuals exhibit behavior $R_1$. Similarly we find an expression for $P_0$, the probability of communities with $\mu = 0$, which exhibit "solid" behavior $R_2$.
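Expression (58) is easy to evaluate numerically once $V$ is specified. The sketch below assumes a hypothetical linear threshold $V(Y_0, N_0) = v_1 Y_0 + v_0 N_0$, chosen only so that the result can be checked against a closed form; the coefficients and the other parameter values are illustrative and not taken from the paper. The inner integral of (58) is a Gaussian tail and is written with the complementary error function.

```python
import math

def p_infinity(n0, alpha, beta, v, steps=4000, width=10.0):
    """Evaluate (58) by the trapezoid rule over Delta_2."""
    sd2 = math.sqrt(beta * n0)
    h = 2.0 * width * sd2 / steps
    total = 0.0
    for k in range(steps + 1):
        d2 = -width * sd2 + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        density = math.exp(-d2 * d2 / (2 * beta * n0)) / math.sqrt(2 * math.pi * beta * n0)
        bound = v(beta * n0 + d2, n0) - alpha * n0                   # right side of (56)
        tail = 0.5 * math.erfc(bound / math.sqrt(2 * alpha * n0))    # expression (57)
        total += weight * density * tail
    return total * h

V1, V0 = 1.5, 0.005                       # hypothetical coefficients of a linear V

def linear_v(y0, n0):
    return V1 * y0 + V0 * n0

def p_infinity_linear(n0, alpha, beta):
    """Closed form for the linear V: Delta_1 - V1*Delta_2 is itself Gaussian."""
    c = (V1 * beta + V0 - alpha) * n0
    return 0.5 * math.erfc(c / math.sqrt(2 * n0 * (alpha + V1 * V1 * beta)))

alpha, beta = 0.03, 0.02                  # illustrative values only
for n0 in (1_000, 10_000, 100_000):
    print(n0, p_infinity(n0, alpha, beta, linear_v), p_infinity_linear(n0, alpha, beta))
```

With these particular coefficients the threshold lies above $\alpha N_0$ by an amount proportional to $N_0$, so the probability of "solid" behavior should fall rapidly as the community grows.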

Let $\mathfrak{N}_0$ be the total number of communities of different sizes, and $\mathfrak{N}(N_0)$, the distribution function of the size of communities. Then

$$\mathfrak{N}_0\,\mathfrak{N}(N_0)\,dN_0 \tag{59}$$

is the number of communities the size of which lies between $N_0$ and $N_0 + dN_0$. In that case

$$\mathfrak{N}_0 \int_0^{+\infty} P_\infty(N_0)\,\mathfrak{N}(N_0)\,dN_0 \tag{60}$$

gives us, for sufficiently large values of $\mathfrak{N}_0$, the number of communities of all sizes, which exhibit a "solid" behavior $R_1$. A similar corresponding expression is found for the number of communities of all sizes, which have $\mu = 0$ and in which "solid" behavior $R_2$ is exhibited.

The expression

$$P(\mu)\,d\mu = d\mu \int_0^\infty P(\mu)\,\mathfrak{N}(N_0)\,dN_0, \tag{61}$$

where $P(\mu)$ under the integral is given by (52), gives us the fraction of all communities, of all sizes, which have a $\mu$ between $\mu$ and $\mu + d\mu$.


IV 
As an illustration we shall apply the general expressions of the preceding section to the case in which $\mu$ is given by (29). Because of (27), expression (29) contains quadratic terms in $X_0$ and $Y_0$. We shall linearize it by considering the very plausible case in which

$$N' \gg X_0; \qquad N' \gg Y_0. \tag{62}$$

In this case we have approximately

$$N' = N_0, \tag{63}$$

and expression (29) becomes

$$\mu = \frac{a_0 X_0 - c_0(1 - \epsilon' N_0)\,Y_0 - aN_0}{c_0 Y_0 - a_0(1 - \epsilon N_0)\,X_0 - aN_0}. \tag{64}$$

This expression corresponds to (34).
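The size of the error introduced by the linearization can be gauged with a short sketch that evaluates (29), with $N'$ taken from (27), next to (64). All parameter values below are arbitrary illustrations chosen so that both numerator and denominator are positive; they are not values from MBSB.

```python
def mu_29(x0, y0, n0, a0, c0, a, eps, eps_p):
    """Expression (29), with the number of passives N' taken from (27)."""
    n_passive = n0 - x0 - y0
    num = a0 * x0 - c0 * y0 - (a - c0 * eps_p * y0) * n_passive
    den = c0 * y0 - a0 * x0 - (a - a0 * eps * x0) * n_passive
    return num / den

def mu_64(x0, y0, n0, a0, c0, a, eps, eps_p):
    """Linearized expression (64), obtained by putting N' = N0."""
    num = a0 * x0 - c0 * (1 - eps_p * n0) * y0 - a * n0
    den = c0 * y0 - a0 * (1 - eps * n0) * x0 - a * n0
    return num / den

# Illustrative values: 300 and 100 actives in a community of 100,000.
args = dict(x0=300.0, y0=100.0, n0=100_000.0, a0=1.0, c0=1.0,
            a=0.0005, eps=1e-5, eps_p=1e-5)
print(mu_29(**args), mu_64(**args))
```

With the actives making up well under one per cent of the community, the two values should differ by only a few per cent.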
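Expression (40) now becomes:

$$\Delta_2 = \frac{a_0[1 + \mu(1 - \epsilon N_0)](\alpha N_0 + \Delta_1) - a(1 - \mu)N_0}{c_0(1 + \mu - \epsilon' N_0)} - \beta N_0. \tag{65}$$

In expression (43) $A$ becomes

$$A = \frac{a_0[1 + \mu(1 - \epsilon N_0)]}{c_0(1 + \mu - \epsilon' N_0)}. \tag{66}$$

Putting

$$B = \frac{a_0[1 + \mu(1 - \epsilon N_0)]\,\alpha N_0 - a(1 - \mu)N_0}{c_0(1 + \mu - \epsilon' N_0)} - \beta N_0, \tag{67}$$

we have now for (40) or, which is the same, for (65):

$$\Delta_2 = A\Delta_1 + B = \Omega(\alpha N_0 + \Delta_1;\ \beta N_0;\ N_0). \tag{68}$$

Expression (52) now becomes, because of (50):

$$P(\mu)\,d\mu = \frac{d\mu}{2\pi N_0\sqrt{\alpha\beta}} \int_{-\infty}^{+\infty} \exp\left[-\frac{s^2}{2\alpha N_0(1 + A^2)} - \frac{\left(\frac{As}{\sqrt{1 + A^2}} + B\right)^2}{2\beta N_0}\right] ds. \tag{69}$$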


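The exponent in (69) may be written:

$$-\frac{(A^2\alpha + \beta)\,s^2 + 2AB\alpha\sqrt{1 + A^2}\; s}{2\alpha\beta N_0(1 + A^2)} - \frac{B^2}{2\beta N_0}. \tag{70}$$

Multiplying the numerator and denominator of the first fraction of (70) by

$$\frac{A^2\alpha + \beta}{A^2B^2\alpha^2(1 + A^2)},$$

then adding and subtracting 1 to the numerator, the latter is brought into the form:

$$\left[\frac{(A^2\alpha + \beta)\,s}{AB\alpha\sqrt{1 + A^2}} + 1\right]^2 - 1.$$

After elementary rearrangements, expression (70) becomes

$$-\frac{[(A^2\alpha + \beta)\,s + AB\alpha\sqrt{1 + A^2}]^2}{2\alpha\beta N_0(1 + A^2)(A^2\alpha + \beta)} - \frac{\beta B^2}{2\beta N_0(A^2\alpha + \beta)}. \tag{71}$$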
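Introducing this into (69) we obtain:

$$P(\mu)\,d\mu = \frac{d\mu}{2\pi N_0\sqrt{\alpha\beta}}\; e^{-\frac{\beta B^2}{2\beta N_0(A^2\alpha + \beta)}} \int_{-\infty}^{+\infty} e^{-\frac{[(A^2\alpha + \beta)s + AB\alpha\sqrt{1 + A^2}]^2}{2\alpha\beta N_0(1 + A^2)(A^2\alpha + \beta)}}\, ds. \tag{72}$$

The integral is of the form

$$\int_{-\infty}^{+\infty} e^{-(ks + m)^2}\, ds. \tag{73}$$

Hence (72) gives

$$P(\mu)\,d\mu = \frac{\sqrt{1 + A^2}}{\sqrt{2\pi N_0(A^2\alpha + \beta)}}\; e^{-\frac{B^2}{2N_0(A^2\alpha + \beta)}}\, d\mu. \tag{74}$$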
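The passage from (69) to (74) can be checked numerically. The sketch below integrates (69) over $s$ by the trapezoid rule, using (50) and (68) for $\Delta_1$ and $\Delta_2$, and compares the result with the closed form (74). The values of $A$, $B$, $\alpha$, $\beta$, and $N_0$ are arbitrary test values chosen for this example.

```python
import math

def p_mu_numeric(A, B, alpha, beta, n0, half_width=400.0, steps=40_000):
    """Integrate expression (69) over s with the trapezoid rule."""
    root = math.sqrt(1 + A * A)
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        s = -half_width + k * h
        d1 = s / root                 # expression (50)
        d2 = A * d1 + B               # expression (68)
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-d1 * d1 / (2 * alpha * n0) - d2 * d2 / (2 * beta * n0))
    return total * h / (2 * math.pi * n0 * math.sqrt(alpha * beta))

def p_mu_closed(A, B, alpha, beta, n0):
    """Closed form (74)."""
    q = A * A * alpha + beta
    return math.sqrt(1 + A * A) / math.sqrt(2 * math.pi * n0 * q) * math.exp(-B * B / (2 * n0 * q))

values = dict(A=0.6, B=5.0, alpha=0.03, beta=0.02, n0=10_000)   # test values only
print(p_mu_numeric(**values), p_mu_closed(**values))
```

The two printed numbers should agree to many decimal places, since the integration range covers the Gaussian integrand many standard deviations on either side.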


Introducing for $A$ and $B$ the expressions (66) and (67), we obtain $P(\mu)$ as a function of $\mu$, $\alpha$, $\beta$, and $N_0$. Setting its derivative with respect to $\mu$ equal to zero, we find that $P(\mu)$ has a maximum for

$$\mu = \frac{a_0\alpha N_0 - c_0(1 - \epsilon' N_0)\,\beta N_0 - aN_0}{c_0\beta N_0 - a_0(1 - \epsilon N_0)\,\alpha N_0 - aN_0}. \tag{75}$$

The same value of $\mu$ is obtained by putting $X_0 = \alpha N_0$, $Y_0 = \beta N_0$ in (64). In other words, the "natural" value of $\mu$, that obtained for zero fluctuations, is the most probable.

Expression (58) now becomes of the form:


1 


+co 
PL =————— G 
= oniNas | (7 + Ax)g(x)da, 


where 


a) ="; O(a) = | 2)ae. 
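Returning to expressions (74) and (75), the location of the maximum of $P(\mu)$ can be examined numerically. The sketch below evaluates (74), with $A$ and $B$ from (66) and (67), on a grid of $\mu$ and compares the grid point of largest density with the "natural" value (75). The parameter values are illustrative assumptions only.

```python
import math

PARAMS = dict(alpha=0.03, beta=0.02, n0=10_000, a0=1.0, c0=1.0, a=0.005)
PARAMS["eps"] = PARAMS["eps_p"] = 0.9 / PARAMS["n0"]     # illustrative choice

def coefficients(mu, alpha, beta, n0, a0, c0, a, eps, eps_p):
    """A and B of expressions (66) and (67)."""
    top = a0 * (1 + mu * (1 - eps * n0))
    bottom = c0 * (1 + mu - eps_p * n0)
    A = top / bottom
    B = (top * alpha * n0 - a * (1 - mu) * n0) / bottom - beta * n0
    return A, B

def density(mu, **p):
    """P(mu) of expression (74)."""
    A, B = coefficients(mu, **p)
    q = A * A * p["alpha"] + p["beta"]
    return (math.sqrt(1 + A * A) / math.sqrt(2 * math.pi * p["n0"] * q)
            * math.exp(-B * B / (2 * p["n0"] * q)))

def natural_mu(alpha, beta, n0, a0, c0, a, eps, eps_p):
    """Expression (75): the value of mu for zero fluctuations."""
    num = a0 * alpha * n0 - c0 * (1 - eps_p * n0) * beta * n0 - a * n0
    den = c0 * beta * n0 - a0 * (1 - eps * n0) * alpha * n0 - a * n0
    return num / den

grid = [1.0 + 0.001 * k for k in range(2001)]            # mu from 1 to 3
peak = max(grid, key=lambda mu: density(mu, **PARAMS))
print(peak, natural_mu(**PARAMS))
```

For these values the grid maximum should fall very close to the natural value; any small offset reflects the slow dependence of the factor in front of the exponential on $\mu$, together with the grid spacing.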


V 


In all the preceding discussions we considered the individual communities as independent of each other. Actually the individuals of each community know something of the behavior of at least some of the neighboring communities. Therefore the communities do influence each other's behavior. The influence is the greater, the better the communication of information between the communities.

The mutual effect of the communities for the case of a continuous $N(\phi)$ has been investigated to some extent (Rashevsky, 1953). It was found that in general, when the communication is equally effective both ways, the larger community will determine the choice of behavior $R_1$ or $R_2$ of the smaller community. If the communication is more or less unidirectional, that is, if one community knows more about the other than the other knows about the first, then the second community may determine the behavior of the first, even though the second is smaller in size. The "n-body problem" of the case is very difficult and has not even been attempted.


In the approximate picture, treated in sections III and IV, the situation may be handled in the following manner so far as the "two-body problem" is concerned.

Let the total number of individuals in a large community be $N_{01}$, that in a small one, $N_{02}$. Let $X_{01}$, $X_{02}$, $Y_{01}$, and $Y_{02}$ stand for the corresponding values of $X_0$ and $Y_0$. If the smaller community is very near the larger one, then the behavior of its passives may be determined by the sums $X_{01} + X_{02}$, $Y_{01} + Y_{02}$, which are then to be substituted in the corresponding expressions for $X_0$ and $Y_0$. At a distance $l$ from the larger community, the effects are reduced and, instead of $X_{01} + X_{02}$, $Y_{01} + Y_{02}$, we use the expressions

$$X_{01}f(l) + X_{02}; \qquad Y_{01}f(l) + Y_{02}, \tag{76}$$

where $f(l)$ is a decreasing function of $l$, such that

$$f(0) = 1; \qquad f(\infty) = 0. \tag{77}$$

Then at very large distances the effect of the larger community becomes negligible, and we have the cases studied in the preceding sections. For finite and not too large distances the behavior of the smaller community is determined by expressions (52) and (58) in which we simply substitute for $X_0 = \alpha N_0$ and $Y_0 = \beta N_0$ the expressions:

$$\alpha N_0 + X_{01}f(l); \qquad \beta N_0 + Y_{01}f(l). \tag{78}$$

If the larger community is very large, then

$$X_{01} = \alpha N_{01}; \qquad Y_{01} = \beta N_{01}; \tag{79}$$

and therefore expressions (52) and (58) give us the different quantities which characterize the behavior of the smaller community in terms not only of $\alpha$, $\beta$, $N_{02}$, but also in terms of $N_{01}$, the size of the larger community, and of $l$, the distance between the two communities.
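The substitution (76) can be sketched in a few lines. The paper leaves $f(l)$ unspecified apart from the conditions (77); the exponential form and the length scale used below are assumptions made only for this example, as are the numbers of actives.

```python
import math

def decay(l, scale=10.0):
    """Hypothetical f(l): exponential decay satisfying f(0) = 1 and f(inf) = 0."""
    return math.exp(-l / scale)

def effective_actives(x01, y01, x02, y02, l):
    """Expression (76): the actives felt by the small community at distance l."""
    f = decay(l)
    return x01 * f + x02, y01 * f + y02

# Illustrative numbers: a large community with 3000 and 2000 actives,
# a small one with 30 and 20.
for l in (0.0, 5.0, 20.0, 100.0):
    print(l, effective_actives(3000.0, 2000.0, 30.0, 20.0, l))
```

At $l = 0$ the small community feels the plain sums of the actives, and at large $l$ only its own, which are the two limiting cases described in the text.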

If the whole society consists of one very large community and a large number of smaller ones and if in the first approximation we neglect the interactions of the small communities, we may generalize the expressions (60) and (61) so as to include the distances from the one large community. If $M(l)\,dl$ denotes the fraction of all communities, which lie at a distance between $l$ and $l + dl$ from the one large community, then instead of (60) we have:

$$\mathfrak{N}_0 \int_0^\infty\!\!\int_0^\infty P_\infty(N_0;\ N_{01};\ l)\,\mathfrak{N}(N_0)\,M(l)\,dN_0\,dl, \tag{80}$$

while (61) becomes:

$$P(\mu)\,d\mu = d\mu \int_0^\infty\!\!\int_0^\infty P(\mu;\ N_0;\ N_{01};\ l)\,\mathfrak{N}(N_0)\,M(l)\,dN_0\,dl. \tag{81}$$

The above approach has the following disadvantage. Since in general $\mu$ is a function of $X_0$, $Y_0$, and $N_0$, but not necessarily of the ratios $X_0/N_0$, $Y_0/N_0$, therefore a small community near a large one will have a $\mu$ different from that of a large one. This is rather implausible since, in the limit when $l = 0$, the two communities merely fuse into one. This would not be the case with expression (64), but it would be with the more general expression (29). Incidentally we may remark that the method is not applicable at all to the illustration of section IV, based on expression (64), because for small values of $l$, $X_{01}$ may become much larger than $X_{02}$, and $Y_{01}$ much larger than $Y_{02}$. In this case the inequality (62), which justifies (64), may not hold.


VI 


In a recent paper (Rashevsky, 1956) we attempted to derive the distribution function $N(\phi)$ from the distribution function $N_1(i')$ of economic need $i'$ and the distribution $N_2(i)$ of income. In this case $\phi$ refers to the preference towards either the maintenance of the economic status quo or towards a change of the existing economic situation. Mutatis mutandis, the same type of derivation may be applied to the case of distributions of "social aspirations" and distribution of social status. In this way the study may be connected with our previous study of the general mathematical theory of distribution of social status (MBSB, chap. viii).

If we wish to apply the present approach to that more complex case, we will have to first express the probability of given deviations from $N_1(i')$ and $N_2(i)$ in communities of smaller size by the method outlined in section II. Then, by means of equations (8) and (9) of the previous paper (Rashevsky, 1956), we shall obtain the probability of a given deviation from $N(\phi)$.

In the special case studied previously, in which $N_2(i)$ is given by

$$N_2(i) = c\,e^{-ci}, \tag{82}$$

where $1/c$ denotes the average per capita income, we may introduce a dependence of the parameter $c$ on the size $N_0$ of the community, based not on considerations of probabilistic fluctuations but on some special economic considerations. Then $N_2(i)$ will have the same form for communities of all sizes, but the parameter will differ from community to community. On this there will be superimposed the probabilistic fluctuation, studied in the preceding sections. To the extent that the needs may be and probably are determined not merely by the psychophysiological characteristics of an individual but also by the socioeconomic environment in which the individual lives, the above may be said also about the function $N_1(i')$. Both for $N_1(i')$ and $N_2(i)$ there may be a "causal" dependence of the parameter of those functions on the size $N_0$ of the community as well as the fluctuations, the probabilities of which depend on $N_0$. The problem thus becomes much more complex and possibly more interesting mathematically.


VII 


In chapter 13 of his recent book Wright Mills (1956) discusses the transition in the United States from a "society of publics" to a "society of masses." Such a transition is of course not limited to the United States. Nor has the actual transition occurred between complete extremes of a "society of publics" and a "society of masses," as Mills explicitly points out. One of the important differences between a society of publics and a society of masses is a greater heterogeneity of opinions in the former, a freer discussion, and a larger ratio of "givers of opinions" to the "takers."

We shall now discuss how some aspects of this sociological 
phenomenon may be interpreted from the point of view of a mathe- 
matical theory of mass behavior. Like any mathematical interpreta- 
tion, the one suggested here must of necessity be to a large extent 
oversimplified and idealized. It may, however, serve the purpose 
of a better understanding of the observed phenomena and may well 
suggest in the future new avenues of approach for empirical studies. 

An opinion, if it has been stated or made known in some manner so as to become the object of a scientific study, is a particular form of behavior. If we imagine an actually nonexistent case of a society in which only two mutually exclusive opinions would be possible, then we could apply all the above considerations to the two opinions. The quantity $\phi$ would then be a measure of the preference towards one or the other opinion.

In the absence of imitative behavior and for the case of a function $N(\phi)$ which is symmetric with respect to $\phi = 0$, we found that half of the population will have one opinion, the other half the opposite opinion. Moreover, at different times different individuals will express a given opinion. If $N(\phi)$ has a maximum at $\phi = 0$ (the most likely situation), then the greatest number of individuals will change their opinion from time to time. Only individuals with either a very large positive $\phi$ or a very large negative $\phi$, that is, individuals with very strong preferences towards one of the two opinions, will very rarely change their opinion even temporarily.

In the presence of imitation effects, the situation is different. The majority of the population will have one opinion, the minority the opposite one. For proper values of the parameters involved, the majority may be very pronounced, the minority almost negligible. The number of individuals with a small value of $\phi$ is now reduced (Rashevsky, 1956). Therefore there are also fewer individuals who change their opinion frequently. The fraction of individuals with a set opinion which is that of the majority is much greater. Though in a very remote way, we definitely notice an analogy between a society without imitation behavior and a society of publics on one hand, and between a society with imitation behavior and a society of masses on the other.

What can cause a transition from a society without imitative be- 
havior to a society with imitative behavior? Such a transition may 
be caused by the variation of any of the different biological and 
sociological parameters which enter in our equations. But one of 
the simplest and most natural causes of such a transition is an in- 
crease in the size of the society. As we have seen in MBSB, chap. 
xiii, in order that imitation would result in a shift of the majority 
of the society towards one of the two opinions, the following in- 
equality must be satisfied (MBSB, p. 100): 


No > ae, (83) 


In this inequality $\sigma$ and $k$ are the parameters of the function $N(\phi)$ and of Landahl's psychophysical discrimination function $P(\phi)$, respectively. The constants $a$ and $A$ are purely neurological parameters. Inequality (83) must hold for the particular type of distribution functions $N(\phi)$ and $P(\phi)$, studied in MBSB. If $N(\phi)$ is a normal distribution and $P(\phi)$ is given by the integral of a Gaussian distribution function, then instead of (83) we have (H. Landau, 1950)


a 
No > BETTE (84) 
where $\sigma$ and $k$ are standard deviations of the two Gaussian functions. In either case, as has been discussed in detail elsewhere (MBSB, chap. xiii), the population $N_0$ of the society must exceed a minimal value for imitation to become effective. Hence for a given set of values of $a$, $A$, $\sigma$, and $k$, a society of sufficiently small size will not show any imitative mass behavior. However, for constant values of $a$, $A$, $\sigma$, and $k$, as the population increases, a moment will come when rather suddenly imitative behavior will set in, and a majority in the society will adopt one of the two opinions. And what is important is that the larger $N_0$, the greater the "prevailing" majority. If $\mu$ is the ratio of the number of individuals which adopt the majority opinion to the number of the minority, then $\mu$ tends to infinity with $N_0$, at least in the simplest theoretically studied case of a homogeneous society in which imitation effects do not depend on the distance between individuals. (For somewhat more realistic cases see Rashevsky, 1950b, 1953.)

This effect of $N_0$ is not always reflected in the approximate theory, treated in section IV where $\mu$ is expressed by equation (64). Let the latter be simplified by a special choice of the parameters. If we put $a_0 = c_0$, that is, consider both classes of actives as having the same coefficients of influence (Rashevsky, 1948; MBSB), and if we put $\epsilon = \epsilon' = 1/N_0$, that is, assume that the efforts of the actives cease only when the whole population behaves the way they want, then (64) becomes, putting $a/a_0 = a/c_0 = a_1$,

$$\mu = \frac{X_0 - a_1 N_0}{Y_0 - a_1 N_0} = \frac{\alpha - a_1}{\beta - a_1}. \tag{85}$$

Thus $\mu$ is independent of $N_0$. This is due to the approximations made.
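The reduction of (64) to (85) under this special choice of parameters is easy to confirm numerically. In the sketch below the values of $\alpha$, $\beta$, $a_0$, and $a$ are illustrative assumptions; $X_0$ and $Y_0$ are set to their most probable values (33).

```python
def mu_64(x0, y0, n0, a0, c0, a, eps, eps_p):
    """Linearized expression (64)."""
    num = a0 * x0 - c0 * (1 - eps_p * n0) * y0 - a * n0
    den = c0 * y0 - a0 * (1 - eps * n0) * x0 - a * n0
    return num / den

def mu_85(alpha, beta, a1):
    """Expression (85), valid for a0 = c0 and eps = eps' = 1/N0."""
    return (alpha - a1) / (beta - a1)

alpha, beta, a0, a = 0.03, 0.02, 2.0, 0.01      # illustrative; a1 = a/a0 = 0.005
for n0 in (1_000, 10_000, 1_000_000):
    value = mu_64(alpha * n0, beta * n0, n0, a0, a0, a, 1 / n0, 1 / n0)
    print(n0, value, mu_85(alpha, beta, a / a0))
```

The same ratio should be printed for every community size, which is the independence of $N_0$ noted in the text.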

Thus, if for a moment we hold on to our remote analogy, mentioned above, we find that an increase in the size of a society will eventually bring a change from a society of publics to a society of masses.

The above analogy is, however, hardly of any value, principally because of our limitation to the case of only two mutually exclusive opinions. This limitation is connected with the limitation to the neurological model of two conflicting stimuli, inherent in Landahl's (1938) theory of psychophysical discrimination. A generalization to the case of any number of mutually exclusive stimuli has
been attempted by Landahl himself, with considerable success, in 
his theory of learning (Landahl, 1941a,b; Rashevsky, 1948, chap. 
xli). The generalization was, however, successful only for the 
particular situation studied. Other attempts (Rashevsky, 1950a) 
have not been crowned with success. The problem is of vital im- 
portance in many fields of mathematical biology and presents con- 
siderable difficulties. 

What we deal with in reality are not simple elementary behavior 
acts, but complex behavior patterns. This is particularly true of 
opinions considered as particular cases of behavior. The follow- 
ing artifice may lead to a solution of our problem. 

Each behavior pattern consists in general of a number of elementary behavior acts, and in many cases each such elementary act $a_i$ has an opposite act $\bar{a}_i$, such that $a_i$ and $\bar{a}_i$ are mutually exclusive.

Consider $n$ such elementary acts $a_i$ $(i = 1, 2, \ldots, n)$. A behavior pattern may consist of any combination of $a_i$ and $\bar{a}_i$, in which two opposite acts $a_i$ and $\bar{a}_i$ do not occur together. If for example we have $n = 2$, then we have the following possible combinations: $a_1a_2$, $a_1\bar{a}_2$, $\bar{a}_1a_2$, $\bar{a}_1\bar{a}_2$. We shall explicitly limit ourselves here to the case in which the elements $a_i$ are independent of each other. Each combination represents a particular behavior pattern. All of those patterns are mutually exclusive but not mutually disjoined because two different patterns may have an element or more in common, like $a_1a_2$ and $a_1\bar{a}_2$. If in a behavior pattern consisting of $n$ elements we substitute for one of the elements its opposite, we obtain a different behavior pattern which, however, shows a great degree of similarity to the first.

We may illustrate the above by the following example. Let the behavior patterns be opinions. Let $a_1$ stand for belief in God, $a_2$ for belief in spooks, $a_3$ for belief that the world is flat. Then an individual with an opinion pattern $a_1 a_2 a_3$ believes in all three things. An individual with $a_1 \bar{a}_2 \bar{a}_3$ believes in God but does not believe in spooks or that the world is flat. An individual with $\bar{a}_1 a_2 \bar{a}_3$ does not believe in God but believes in spooks and does not believe that the world is flat. If we make $a_3$ stand for belief in the infallibility of the church, then we would have an impossible combination, $\bar{a}_1 a_2 a_3$: a man not believing in God but believing in the infallibility of the church. The situation is impossible because $a_3$ implies $a_1$, and hence $a_1$ and $a_3$ are not independent. Hence the necessity to limit ourselves to the case of independent $a_i$'s.

If there are $n$ independent elementary behaviors, then the total number $w(n)$ of possible different behavior patterns is

$$w(n) = 2^n. \tag{86}$$

This is readily proved by induction. Equation (86) holds, as we have already seen, for $n = 2$. We shall prove that it holds for $n + 1$ if it holds for $n$. All the patterns for $n + 1$ may be obtained by adding to each of the $2^n$ patterns either $a_{n+1}$ or $\bar{a}_{n+1}$. This makes the total number of patterns $w(n + 1)$ equal to $2 \times 2^n = 2^{n+1}$.
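The counting argument can be checked by direct enumeration. The following is only an illustrative sketch: the function names and the plain-text rendering `~a_i` for a barred act are assumptions introduced here, not notation from the paper.

```python
from itertools import product

def behavior_patterns(n):
    """Return all patterns for n independent elementary acts.

    Each pattern is a tuple of booleans: True at position i means the act
    a_(i+1) is exhibited, False means its opposite (the barred act).
    """
    return list(product((True, False), repeat=n))

def label(pattern):
    # Plain-text rendering; "~a2" stands for the barred act opposite to a2.
    return " ".join(("a%d" if on else "~a%d") % (i + 1)
                    for i, on in enumerate(pattern))

# Equation (86): the number of patterns doubles with each added pair.
for n in range(1, 6):
    assert len(behavior_patterns(n)) == 2 ** n

print([label(p) for p in behavior_patterns(2)])
```

For $n = 2$ the enumeration lists the same four combinations given in the text.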

Each pair $a_i$, $\bar{a}_i$ of opposite elementary behaviors may be treated as a pair of mutually exclusive behaviors $R_{1i}$ and $R_{2i}$, and all the discussions in the preceding sections apply to this case. For each $a_i$ we have now a quantity $\phi_i$ which measures the preference of the individual for either $a_i$ or $\bar{a}_i$. The distribution function $N_i(\phi_i)$ will in general be different for different indices $i$. Hence for different pairs of elementary mutually exclusive behaviors we shall have different values $\sigma_i$ and $k_i$ of the parameters $\sigma$ and $k$ in inequalities (83) or (84). Similarly we shall have different values $A_i$ and $a_i$ for $A$ and $a$. Hence there will be $n$ different values for the right side of inequality (83) or (84), and we can arrange them in increasing order.

If $N_0$ is less than the smallest value of the right side of (83) or (84), then the effect of imitation does not manifest itself on any of the $a_i$. If all $N_i(\phi_i)$ are symmetric with respect to $\phi_i = 0$, then for sufficiently small values of $N_0$, when imitative behavior does not yet set in, all $2^n$ patterns will be equally probable. Each pattern will be exhibited by the fraction $2^{-n}$ of the whole population, since the probability of each $a_i$ or $\bar{a}_i$ is equal to $\tfrac{1}{2}$, and the $a_i$'s are not correlated, as we have assumed from the outset. Speaking in terms of opinions, we have a society in which all of the $2^n$ possible opinion patterns are equally represented and changes of opinions are frequent, because for unimodal distributions $N_i(\phi_i)$ all of those functions have a maximum at $\phi_i = 0$. We now have a somewhat better analogy to a society of publics.

As $N_0$ increases, it will exceed some of the values of the right side of (83) or (84). Imitation behavior will set in for the corresponding pairs $a_i$, $\bar{a}_i$; and in each such pair one of the two elements will be adopted by the majority of the society. Let the number of indices $i$ for which this will happen be $r < n$. Of those pairs let there be $r_1$ pairs in which the "unbarred" behaviors $a_{i_1}, a_{i_2}, \ldots, a_{i_{r_1}}$ are accepted by the majority, and $r_2$ pairs in which the "barred" behaviors $\bar{a}_{k_1}, \bar{a}_{k_2}, \ldots, \bar{a}_{k_{r_2}}$ are exhibited by the majority. We have of course

$$r_1 + r_2 = r. \tag{87}$$


The probability that an individual exhibits an $a_{i_j}$ or an $\bar{a}_{k_m}$ is now greater than $\tfrac{1}{2}$; therefore the probabilities of the different patterns are no longer equal. Patterns which contain the combination $a_{i_1}, a_{i_2}, \ldots, a_{i_{r_1}}, \bar{a}_{k_1}, \ldots, \bar{a}_{k_{r_2}}$ will have a higher probability than the others. Let $\mu_{i_j}$ and $\mu_{k_m}$ be correspondingly the ratios of the number of individuals who exhibit $a_{i_j}$ or $\bar{a}_{k_m}$ to the number of individuals who exhibit the opposite, that is, $\bar{a}_{i_j}$ or $a_{k_m}$. Then, according to what we just said:


$$\mu_{i_j} > \tfrac{1}{2} \quad \text{and} \quad \mu_{k_m} > \tfrac{1}{2}. \tag{88}$$


But those $\mu$'s are equal to the probabilities that a given individual exhibits the corresponding elementary behaviors $a_{i_j}$ or $\bar{a}_{k_m}$. We shall denote those probabilities by $p_{i_j}$ and $p_{k_m}$, and we have because of (88):

$$p_{i_j} > \tfrac{1}{2}, \qquad p_{k_m} > \tfrac{1}{2}. \tag{89}$$

The probability of any pattern containing all of the $r$ elementary acts for which imitative behavior is manifested, that is, of any pattern containing the combination

$$a_{i_1} a_{i_2} \cdots a_{i_{r_1}} \bar{a}_{k_1} \cdots \bar{a}_{k_{r_2}},$$

is given by

$$P_r = 2^{-(n-r)} \prod_{j=1}^{r_1} p_{i_j} \prod_{m=1}^{r_2} p_{k_m}. \tag{90}$$

The factor $2^{-(n-r)}$ is due to the $n - r$ pairs $a_i$, $\bar{a}_i$ for which imitation behavior has not set in. As $N_0$ increases, so does $r$. Finally as $r$ becomes equal to $n$, there is only one pattern out of $2^n$ which consists of only the "majority" elementary acts. That pattern will be accepted by a relative majority of the population but not necessarily by the absolute majority. The probability of that pattern is obtained from (90) by putting $r = n$ and remembering that now $r_1 + r_2 = n$, because of (87). We now have:

$$P_n = \prod_{j=1}^{r_1} p_{i_j} \prod_{m=1}^{n-r_1} p_{k_m}. \tag{91}$$


Any other pattern will consist of some $\rho$ elements $a_{i_j}$ or $\bar{a}_{k_m}$, and of $n - \rho$ other elements, the probability of which is $\tfrac{1}{2}$. Hence the probability of any such other pattern will be given by (90) in which we substitute $\rho$ for $r$. Because of (89) we have

$$P_\rho < P_n \tag{92}$$

for any $\rho > 0$. Hence $P_n$ is the greatest of all probabilities possible and the pattern with this probability is exhibited by the greatest number of individuals. Nevertheless $P_n$ may be less than $\tfrac{1}{2}$. This may happen if the $p_{i_j}$'s and the $p_{k_m}$'s are only slightly greater than $\tfrac{1}{2}$ and if $n$ is sufficiently large. If, however, the $p_{i_j}$'s and the $p_{k_m}$'s are very close to 1, then $P_n > \tfrac{1}{2}$ and the behavior pattern $a_{i_1} \cdots a_{i_{r_1}} \bar{a}_{k_1} \cdots \bar{a}_{k_{r_2}}$ is exhibited by the absolute majority. This absolute majority may be as large as we wish, if for a given $n$ the $p_{i_j}$'s and the $p_{k_m}$'s are sufficiently close to 1. This can always happen for certain values of the parameters $A_i$, $a_i$, $\sigma_i$, and $k_i$.
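The distinction between a relative and an absolute majority can be made concrete with a small numerical sketch. The assumptions here are not from the paper: all $n$ pairs are taken to have the same majority probability $p$, the minority element of each pair is given probability $1 - p$, and the function names are illustrative only.

```python
from itertools import product
from math import prod

def pattern_probability(pattern, p_major):
    """Probability of a pattern when the elementary acts are independent.

    pattern[i] is True if the majority element of pair i is exhibited;
    p_major[i] is the probability of that majority element (assumed > 1/2),
    and the minority element is assumed to have probability 1 - p_major[i].
    """
    return prod(p if on else 1.0 - p for on, p in zip(pattern, p_major))

def majority_pattern_probability(p_major):
    # Equation (91): product of the majority-element probabilities (r = n).
    return prod(p_major)

n = 10
for p in (0.6, 0.99):
    p_major = [p] * n
    P_n = majority_pattern_probability(p_major)
    # Search all 2**n patterns for the most probable one.
    best = max(product((True, False), repeat=n),
               key=lambda pat: pattern_probability(pat, p_major))
    print(p, round(P_n, 4), all(best), P_n > 0.5)
```

With ten pairs and $p = 0.6$ the all-majority pattern is still the most probable one, yet it is held by well under one per cent of the population; with $p = 0.99$ the same pattern is held by roughly nine individuals in ten.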

Hence as $N_0$ increases or, in other words, as the society grows in size, some of the possible $2^n$ opinion patterns will become more frequent than others until for sufficiently large values of $N_0$ one particular pattern will be exhibited by the majority. For some values of the parameters this majority may be very large. All the remaining $2^n - 1$ opinion patterns may be relegated to only a very small group of individuals. We have here a somewhat better analogy to a society of masses.

As we have seen, the frequency of incidence of a given opinion pattern, or its probability, is expressed in terms of the probabilities $p_{i_j}$ and $p_{k_m}$ of the individual independent elementary opinions. But those probabilities, as we remarked, are equal to the corresponding values $\mu_{i_j}$ and $\mu_{k_m}$. The latter ones, however, can be calculated by the methods discussed in the preceding sections. The equations developed in those preceding sections will give us the fluctuations of the values of $\mu_{i_j}$ and $\mu_{k_m}$ for communities of different sizes, and thus we shall obtain different quantitative characterizations of the fluctuation of the "opinion structure" of different size communities. Such quantitative characterizations may well be amenable to experimental verification.

To the extent that the method outlined in section II has not yet
been developed, we shall have to use the expressions derived in
section III. As we already mentioned, they may be considerably
simplified by using the assumptions which lead to equation (85).
With those assumptions we find from (66) and (67):


eS 7p Ga (93) 
m 
and (74) becomes: 


Laqo, Fula Bl 

14 p? Seyret reas a 

Phage ase Fg 2 (at Bu) du. (94) 
2nNo(a + Bp?) 

If all the expressions formed from the $p_{i_j} = \mu_{i_j}$ and $p_{k_m} = \mu_{k_m}$ represent probabilities of different patterns, their sum must be equal to 1. That this is so is proved by induction. Consider the case of $n = 2$. Let $p_1$ and $q_1 = 1 - p_1$ be the probabilities (or incidences) of $a_1$ and $\bar{a}_1$. Let $p_2$ and $q_2 = 1 - p_2$ be the corresponding probabilities for $a_2$ and $\bar{a}_2$. The probabilities for the four patterns $a_1 a_2$, $a_1 \bar{a}_2$, $a_2 \bar{a}_1$, and $\bar{a}_1 \bar{a}_2$ are correspondingly

$$p_1 p_2, \quad p_1 (1 - p_2), \quad p_2 (1 - p_1), \quad (1 - p_1)(1 - p_2). \tag{95}$$

Their sum equals 1.
Suppose now that this holds true for $n$. We shall prove that it then holds for $n + 1$. Denote the $2^n$ probabilities by $P_\nu$ ($\nu = 1, 2, \ldots, 2^n$). We have, according to the assumption:

$$\sum_{\nu=1}^{2^n} P_\nu = 1. \tag{96}$$

All the $2^{n+1}$ patterns for $n + 1$ are obtained by adding either $a_{n+1}$ or $\bar{a}_{n+1}$ to each of the $2^n$ patterns already given. Let $p_{n+1} = \mu_{n+1}$ be the probability of $a_{n+1}$, and $1 - p_{n+1}$ the probability of $\bar{a}_{n+1}$. Then the probabilities of one half of the number of $2^{n+1}$ new patterns will be given correspondingly by

$$P_\nu \, p_{n+1} \qquad (\nu = 1, 2, \ldots, 2^n),$$

while the probability of the others will be given by

$$P_\nu (1 - p_{n+1}).$$

The sum of all the probabilities is

$$\sum_{\nu=1}^{2^n} P_\nu \, p_{n+1} + \sum_{\nu=1}^{2^n} P_\nu (1 - p_{n+1}) = p_{n+1} \sum_{\nu=1}^{2^n} P_\nu + (1 - p_{n+1}) \sum_{\nu=1}^{2^n} P_\nu = \sum_{\nu=1}^{2^n} P_\nu = 1. \tag{97}$$

Hence, for all $n$'s, $\sum P_\nu = 1$.
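The induction can also be checked numerically. In the sketch below the probabilities of the elementary acts are drawn at random purely for illustration; the function name, the seed and the tolerance are assumptions, not part of the paper.

```python
from itertools import product
from math import prod
import random

def total_probability(p):
    """Sum the probabilities of all 2**n patterns for act probabilities p.

    p[i] is the probability of a_(i+1); its opposite has probability 1 - p[i].
    """
    return sum(
        prod(pi if on else 1.0 - pi for on, pi in zip(pattern, p))
        for pattern in product((True, False), repeat=len(p))
    )

rng = random.Random(0)  # fixed seed, chosen only for reproducibility
for n in range(1, 9):
    p = [rng.random() for _ in range(n)]
    # The pattern probabilities sum to 1 up to rounding error.
    assert abs(total_probability(p) - 1.0) < 1e-12
```

The check holds for any choice of the individual probabilities, which is what the induction asserts.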

In the above-discussed cases as No grows, one of the many pos- 
sible opinion patterns becomes strongly prevailing due to imitation 
and is exhibited by almost the entire population. It is, however, 
actually the opinion of the very minority of ‘‘actives’’ who exhibit 
exactly that combination of elementary opinions and who have a 
particularly strong preference for each element of that combination. 
We have here something akin to the small ratio of the number of 
givers of opinion to the number of takers, discussed by Mills (loc.
cit.). The givers are those few individuals who even in absence of
imitation behavior have a very strong preference for the particular
pattern and hardly ever deviate from it or change their opinion.
The fraction of those individuals is much smaller than $2^{-n}$. Most
of the individuals who exhibit in absence of imitation a given pat- 
tern have not a particularly strong preference for it. The individual 
members of that fraction also constantly change, the equilibrium 
being a dynamic one. 

As we said at the beginning of this section, we attempted to in- 
terpret only some aspects of the transition from a society of pub- 
lics to a society of masses. We shall only barely suggest here the 
possibility of interpreting another aspect in terms of the concepts 
of section V. We suggested there a theory of mutual influence of 
different communities. We introduced the distance function $f(l)$ to describe the effect of a larger community upon the behavior of a smaller one. In general the effects of both communities on each other will be reciprocal, though not equal. Instead of (78) we shall have generally similar expressions describing the effects of the second community on the first, with in general a different function $f_2(l)$. The sizes of the two communities may even be equal. But if the functions $f(l)$ and $f_2(l)$ are different, the effects of the first community upon the second will be different from the effects of the second upon the first. To the extent that those effects influence the distribution of different behavior patterns, in particular of different opinions, we may have here a simple mathematical representation of "one way" opinion flow, discussed by Mills.


The author is indebted to Professor Herbert D. Landahl for a 
discussion of this paper and for valuable critical comments. 

This work was aided by a grant from the Dr. Wallace C. and 
Clara A. Abbott Memorial Fund of The University of Chicago. 


LITERATURE 


Landahl, H. D. 1938. "A Contribution to the Mathematical Biophysics of Psychophysical Discrimination." Psychometrika, 3, 107-25.

Landahl, H. D. 1941a. "Studies in the Mathematical Biophysics of Discrimination and Conditioning I." Bull. Math. Biophysics, 3, 18-26.

Landahl, H. D. 1941b. "Studies in the Mathematical Biophysics of Discrimination and Conditioning II. Special Case: Errors, Trials, and Number of Possible Responses." Ibid., 3, 71-7.

Landau, H. G. 1950. "Note on the Effect of Imitation in Social Behavior." Bull. Math. Biophysics, 12, 221-35.

Mills, C. Wright. 1956. The Power Elite. New York: Oxford University Press.

Rashevsky, N. 1949a. "Mathematical Biology of Social Behavior." Bull. Math. Biophysics, 11, 105-13.

Rashevsky, N. 1949b. "Mathematical Biology of Social Behavior: II." Ibid., 11, 157-63.

Rashevsky, N. 1949c. "Mathematical Biology of Social Behavior: III." Ibid., 11, 255-71.

Rashevsky, N. 1950a. "Psychophysical Discrimination with More Than Two Stimuli." Ibid., 12, 215-20.

Rashevsky, N. 1950b. "Mathematical Biology of Social Behavior: IV. Imitation Effects as a Function of Distance." Ibid., 12, 177-85.

Rashevsky, N. 1951a. Mathematical Biology of Social Behavior. Chicago: The University of Chicago Press.

Rashevsky, N. 1951b. "A Note on Imitative Behavior and Information." Bull. Math. Biophysics, 13.

Rashevsky, N. 1953. "Imitative Behavior in Nonuniformly Spatially Distributed Populations." Ibid., 15, 63-71.

Rashevsky, N. 1956. "Studies in Mathematical Biosociology of Imitative Behavior: I. Effects of Income Distribution." Ibid., 18, 323-36.


RECEIVED 10-1-56 




BULLETIN OF 
MATHEMATICAL BIOPHYSICS 
VOLUME 19, 1957 


A STATISTICAL MECHANICS OF INTERACTING 
BIOLOGICAL SPECIES 


EDWARD H. KERNER 
PHYSICS DEPARTMENT 
UNIVERSITY OF BUFFALO, BUFFALO, NEW YORK 


The system of differential equations proposed by V. Volterra, describing the variation in time of the populations $N_r$ of interacting species in a biological association, admits a Liouville's theorem (when log $N_r$ are used as variables) and a universal integral of "motion." Gibbs' microcanonical and canonical ensembles can then provide a thermodynamic description of the association in the large. The "temperature" measures in one number common to all species the mean-square deviations of the $N_r$ from their average values. There are several equipartition theorems, susceptible of direct experimental test, a theorem on the flow of "heat" (the conserved quantity in an isolated association) between two weakly coupled associations at different temperatures, a Dulong-Petit law for the heat capacity, and an analog of the second law of thermodynamics expressing the tendency of an association to decline into an equilibrium state of maximal entropy. The analog of the Maxwell-Boltzmann law is a distribution of intrinsic abundance for each species which has been successfully used by ecologists for interpreting experimental data. A true thermodynamics develops upon introducing the idea of work done on an association through a variation of the variables (such as physical temperature) defining the physical and chemical environment. An ergodic theorem is suggested by the agreement of ensemble and time averages in the one case where the latter may be found explicitly.


1. Introduction.

It has often been noticed that the science of population dynamics 
ought to be capable of some description in statistical or thermo- 
dynamic-like terms such as is provided for the Newtonian mechanics 
of a system of particles by the theory of statistical mechanics. 
For example A. J. Lotka (1925) has remarked "... what is needed is an analysis ... that shall envisage the units of a biological population as the established statistical mechanics envisage molecules, atoms and electrons; that shall deal with such average effects as population density, population pressure, and the like, after the manner in which thermodynamics deal with the average effects of gas concentration, gas pressures, ...." Some such sort of
analysis has indeed found expression in various systems of differ- 
ential, or integro-differential, or other kinds of equations, which 
supposedly control the population numbers of the interacting spe- 
cies in a biological association. Notable examples are to be found 
in the works of V. Volterra (1931, 1937). 

Now quite clearly these theories are of the nature of statistical 
theories in that they lay no claim to a power of detailed and pre- 
cise prediction about the population numbers of any single well- 
defined biological association, but at best are concerned with 
average or most probable numbers. Implicitly they refer to ensem- 
bles of similar biological associations and in experimental tests 
assume that the one system under test is not appreciably different 
from some kind of most probable system. Thus the theories are 
characterized at the outset by a high phenomenological content at- 
tempting to describe directly the behavior of such a most probable 
system. This is evidenced by the absence of means of finding 
fluctuations from most probable population numbers; and by the ap- 
pearance of numerous parameters, such as ‘‘coefficients of self- 
accretion,’’ which remain unevaluated from any set of first princi- 
ples but must be found from an experimental observation of test 
systems, much as the decay constant of a radioactive element re- 
mains an empirical constant when there is no underlying quantum 
theory to account for it. 

In short, the theories of population interaction are the statistics, 
or a part of them, in the form of surmises and empirical laws, with- 
out the mechanics. They are loosely thermodynamic-like to this 
extent, but not at all of a statistical-mechanical nature. This is 
of course inevitably so because of the colossal complexity of the 
mechanics over which the statistics must be done. 

Yet one might still ask of statistics that it speak further about a 
biological association. For, starting now with a phenomenological 
description, such as Volterra’s, one finds another order of com- 
plexity as soon as the number of interacting species exceeds even 
a few: the equations are not amenable to explicit solution by avail- 
able methods. The situation becomes somewhat analogous to that 
in the classical mechanics of many interacting particles, where 
between the known laws of motion and the knowledge of the motion 
hangs a deep mathematical fog, penetrated faintly but importantly
by a few conservation laws. It is just in such a state of ignorance
that there is room for statistical considerations, for statistical 
mechanics proper. Can then biological associations of many inter- 
acting species as whole entities, like whole mechanical systems 
of many degrees of freedom, be characterized in their entirety by 
equilibrium states in the thermodynamical sense? by a tendency to 
decline into such states? by variables of state, such as tempera- 
ture and entropy? by an equation of state? 

Our object in this note is to point out the possibility of affirma- 
tive answers by sketching a construction of statistical mechanics 
on top of a phenomenological population dynamics taken as given. 
The tools for this are ready to hand in Gibbs ensemble theory, the 
population dynamics being that advanced by Volterra (1937). 

The Hamiltonian form given to his dynamics by Volterra is what 
in the first instance suggests a statistical development imitating 
that familiar in physics.* While such a program can be executed 
in principle it is in practice formidable and not very profitable. 
This occurs because of a certain artificiality of the Hamilton 
formulation. The starting Volterra differential equations in the population numbers, $N_r$, of the different species $r$ in biological associations are of the first order; these Volterra then makes into second-order equations by writing $N_r = dX_r/dt$ and subsequently introducing a Lagrangian and other apparatus of classical mechanics
to arrive again at first-order Hamiltonian equations. These are 
twice as numerous as the starting equations, but of course half of 
the ‘‘constants of the motion’’ other than the Hamiltonian are al- 
ready known. That is, the final differential system comes into 
existence with a large, but false, amount of information embedded 
in it. 

*S. Takenaka (1941) has observed that Liouville's theorem holds in Volterra's Hamiltonian formulation; this is just a well-known property of the formulation itself.

In studying now the motion of the system point in phase space the extraneous constants of motion are a heavy burden of constraints, leading in this context to appreciable mathematical difficulties. Moreover in erecting a statistical mechanics a separate "temperature" must be introduced for each of the many constants of motion (Grad, 1952) so as to accord them in the statistics their due weight as elements of knowledge of the system. Thus the
statistical mechanics is cluttered with at least as many statistical 
parameters as degrees of freedom in the original first-order dif- 
ferential equations; and the purpose of a statistical inquiry is very 
nearly defeated. The Hamiltonian rendition, in other words, forces 
us to survey our genuine ignorance through a tangle of confusing 
and trivial information. Finally, disregarding the excess constants 
of motion, the Volterra Hamiltonian is structurally complex enough 
to discourage statistical considerations around it alone, lacking in 
particular the important feature of being a ‘‘sum-function’’ 
(Khinchin, 1949). 

To proceed at once from the starting Volterra equations is therefore plainly desirable, if not absolutely necessary. Now, the statistical mechanics customary in physics, that form of it elaborated
by J. W. Gibbs (1902), rests on the Hamiltonian form of the equa- 
tions of motion only weakly, almost incidentally, the role of Hamil- 
ton’s equations being to make evident the two corner-stones of the 
statistical development: Liouville’s theorem and energy conserva- 
tion. It will appear that the initial Volterra equations readily 
admit a Liouville's theorem and a universal constant of the "motion"
somewhat like the Hamiltonian of classical dynamics; and
then a statistical analysis of some simplicity, parallel to Gibbs’, 
becomes feasible. Herewith we find a lesson for physics as well 
as from physics, an example of how much broader is the statistical 
side of statistical mechanics than the mechanics which calls it 
into existence. 

There are, clearly, important objections to this proposed pro- 
gram. The description of interacting biological species offered by 
Volterra is surely only an approximation, probably quite crude, to 
a very complicated state of affairs, and it may seem improperly 
speculative to build further on it. However the Volterra equations 
contain at least qualitatively some important biological truths, and 
in certain cases a reasonably accurate depiction of experimental 
findings. Conceivably the position is roughly like that in statisti- 
cal mechanics or kinetic theory when based on the highly idealized 
picture of atoms as small Newtonian billiard balls, the picture 
being not so much incorrect as incomplete, but adequate to give 
valid concepts and results. It must not be forgotten of course that 
the equations are already statistical in character, as remarked 
earlier, so that in using Gibbs' ensembles on top of them we are really contemplating an ensemble of ensembles, some kind of grand ensemble.

More significant perhaps is the objection that no useful purpose 
is served by a statistical Volterra mechanics, that there is no need 
for it. For we are not faced in the population-biology, as we are 
in physics, by macroscopic observables and laws which make com- 
pelling an explanation of how they are actuated in terms of the 
microscopic variables. That is, the population numbers, $N_r$, in a
biological association, or a few of them, are the data which are 
experimentally determined, and these are the microscopic variables 
themselves in the proposed scheme. This puts us in the position 
we would be in in physics if our observations on a gas consisted 
in measurements of the coordinates or momenta of a few of the gas 
atoms rather than the gas temperature or pressure. 

The analogy here gives some answer to the objection posed: the 
non-observation of macroscopic variables and laws does not nec- 
essarily mean that they do not exist or that it is pointless to in- 
vent or discover them. One can easily visualize in the case of a 
detailed knowledge of the positions of a few gas atoms that there 
remains point to the introduction of thermodynamic concepts. In 
effect the behavior of a few microscopic coordinates samples and 
bears the impress of the operation of the larger system in which 
they are immersed; in this sense the large system is observable 
and its statistical workings of legitimate concern. 

Though the following considerations have no perfectly firm founda- 
tion, and their usefulness be only partly apparent at this juncture, 
it is hoped that they may exhibit, however crudely, some possi- 
bilities for a fresh mode of understanding of biological association. 


2. Volterra’s Mechanics. 
The equations proposed by Volterra to describe the behavior in time of $n$ biological species in interaction, having populations $N_1, N_2, \ldots, N_n$, are

$$\frac{dN_r}{dt} = \epsilon_r N_r + \frac{1}{\beta_r} \sum_s \alpha_{sr} N_s N_r. \tag{1}$$


The first term on the right-hand side expresses how each species propagates if left to itself in a given environment and no other species interacts with it. It provides an exponential fall or rise of $N_r$ in time according as the coefficient of self-accretion $\epsilon_r$ (natural birth minus death rate) is negative or positive. The remaining terms express the interaction of species $r$ with all other species $s$, stating that the increase or decrease of $N_r$ per unit time is effectively proportional to the number of encounters per second between $r$ and any $s$, taken to be measured by the product $N_r N_s$. To account for the one-sided nature of the encounters, wherein if $r$ gains because of the encounter then $s$ must lose, the interaction coefficients $\alpha_{sr}$ are antisymmetric, $\alpha_{rs} = -\alpha_{sr}$. The positive quantities $\beta_r^{-1}$ are Volterra's "equivalent numbers" such that in the binary encounters $r, s$ the ratio of the number of $s$'s lost (or gained) per second to the number of $r$'s gained (or lost) per second is $\beta_s^{-1} : \beta_r^{-1}$.

A particular interest attaches to the stationary states of the biological association, those for which all $dN_r/dt$ vanish and for which the population numbers $N_r$ have the steady values $q_r$ defined by

$$\epsilon_r \beta_r + \sum_s \alpha_{sr} q_s = 0. \tag{2}$$

At some cost in generality we shall assume these equations to have a unique solution with all $q_r$ positive, possible only if the number of species is even and if all $\epsilon_r$ do not have the same sign.
This is perhaps the most interesting and important case. The more 
general cases need a separate investigation outside the scope of 
the present one. 
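The role of an even number of species can be illustrated with exact arithmetic: the linear system (2) has the antisymmetric array of the $\alpha_{sr}$ as its matrix, and an antisymmetric matrix of odd order has zero determinant. The sketch below uses hypothetical coefficients that do not come from the paper, and the helper `det` is an assumption introduced only for this check.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot in this column: singular
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign                # a row swap flips the sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            m[r] = [a - factor * b for a, b in zip(m[r], m[c])]
    return sign * result

# Hypothetical antisymmetric coefficients for three species,
# stored as alpha[s][r] = -alpha[r][s].
alpha3 = [[0, 2, -1], [-2, 0, 3], [1, -3, 0]]
print(det(alpha3))   # zero, so (2) has no unique solution for n = 3

# Hypothetical two-species case: species 1 is the prey, species 2 the predator.
eps = [Fraction(1), Fraction(-1, 2)]       # self-accretion, opposite signs
beta = [Fraction(1), Fraction(2)]
alpha = [[Fraction(0), Fraction(1, 10)],   # alpha[0][1] is alpha_12 > 0
         [Fraction(-1, 10), Fraction(0)]]  # alpha[1][0] is alpha_21 = -alpha_12

# For n = 2 each equation of (2) contains a single unknown.
q1 = -eps[1] * beta[1] / alpha[0][1]
q2 = -eps[0] * beta[0] / alpha[1][0]
print(q1, q2)
```

In the two-species example both steady values are positive only because the two $\epsilon_r$ differ in sign, in line with the condition stated above.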

Volterra has shown in an elegant discussion that the $N_r$ are variable between finite positive limits; that at least some, often all of them, fluctuate continually without damping out; and that their time averages are the steady values $q_r$ and so are independent of their
initial values. These characteristics are most congenial to our 
aims. 

Let us rewrite equation (1) as

$$\frac{\beta_r}{N_r} \frac{dN_r}{dt} = \epsilon_r \beta_r + \sum_s \alpha_{sr} N_s \tag{3}$$

and then introduce $\epsilon_r \beta_r$ from equation (2) and also the new dependent variables

$$v_r = \log \frac{N_r}{q_r}, \qquad N_r = q_r e^{v_r}, \qquad v_s = \log \frac{N_s}{q_s}.$$

This gives the equations of "motion"

$$\beta_r \dot{v}_r = \sum_s \alpha_{sr} q_s \left(e^{v_s} - 1\right), \qquad \dot{v} = \frac{dv}{dt}, \tag{4}$$

in the form we shall use.

An omission in the above equations is a self-interaction term,
$\sim N_r^2$, of the Verhulst-Pearl type. Such terms have been shown
quite generally by Volterra (1931) to give a kind of frictional damp- 
ing of the otherwise undamped oscillations about the stationary 
state. The system dies down to this state eventually; the remote 
future of the system is foreseeable; and the scope for a statistical 
analysis is much narrowed. Also, with these terms no constant of 
the motion is available. We continue to assume them to be negli- 
gible, limiting ourselves to systems showing bounded and un- 
damped motions (‘‘conservative’’ system in Volterra’s phrase). 
The restriction to even numbers of species seems artificial, but as 
Volterra has noted it is probable that uneven systems decay into 
even ones. 


3. Statistical Mechanics; Microcanonical Ensembles.

The purpose of having introduced the particular variables $v_r$ is to secure a Liouville's theorem. Consider a large number of copies, a Gibbs ensemble of biological associations each of the same character and each controlled by the same differential equations (4) but having all variety of initial values of $v_r$. In the Cartesian space of the $v_r$ (phase space) the configuration of each copy is represented by a point; the ensemble by an ensemble of points. The points are propelled in phase space by the motional equations (4). When taken to be sufficiently numerous the points constitute a fluid of, say, density $\rho(v_1, v_2, \ldots, v_n)$ at a point $(v_1, v_2, \ldots, v_n)$, and velocity $\mathbf{V} = (\dot{v}_1, \dot{v}_2, \ldots, \dot{v}_n)$ at this point. Since fluid is neither created nor destroyed we must have the hydrodynamical equation of continuity

$$\frac{\partial \rho}{\partial t} + \operatorname{div}(\rho \mathbf{V}) = \frac{\partial \rho}{\partial t} + \sum_r \dot{v}_r \frac{\partial \rho}{\partial v_r} + \rho \sum_r \frac{\partial \dot{v}_r}{\partial v_r} = 0.$$

But the latter sum vanishes according to equation (4), which tells that $\dot{v}_r$ is independent of $v_r$ ($\alpha_{rr} = 0$). There follows Liouville's theorem of the conservation of density in phase,

$$\frac{D\rho}{Dt} = \frac{\partial \rho}{\partial t} + \sum_r \dot{v}_r \frac{\partial \rho}{\partial v_r} = 0, \tag{5}$$


stating that as one goes along with the motion of one system point 
the density in its neighborhood remains invariable. In particular 
it may be noted (Tolman, 1938) that a constant density of phase 
points (uniform ensemble) stays constant; there is no tendency of 
the motional equations to enrich one part of phase space over 
another. Another consequence, or equivalence of Liouville’s 
theorem, is Gibbs’ principle of conservation of extension in phase, 
to the effect that an element of volume of phase space, though 
changing its shape, maintains a uniform size as the motions of its 
points unfold, so long as its boundaries are marked by the same 
points. 
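Why the logarithmic variables are needed can be seen by estimating the divergence of the velocity field in both sets of variables. This is a numerical sketch only: the two-species parameters are hypothetical, chosen so that the steady values are $q = (10, 10)$, and the finite-difference helper is an assumption introduced for the illustration.

```python
import math

# Hypothetical two-species parameters (not from the paper).
eps = [1.0, -0.5]
beta = [1.0, 2.0]
alpha = [[0.0, 0.1], [-0.1, 0.0]]   # alpha[s][r], antisymmetric
q = [10.0, 10.0]                    # steady values satisfying equation (2)

def v_dot(v):
    """Equation (4) solved for dv_r/dt."""
    n = len(v)
    return [sum(alpha[s][r] * q[s] * (math.exp(v[s]) - 1.0) for s in range(n))
            / beta[r] for r in range(n)]

def n_dot(N):
    """Right-hand side of equation (1)."""
    n = len(N)
    return [eps[r] * N[r]
            + sum(alpha[s][r] * N[s] * N[r] for s in range(n)) / beta[r]
            for r in range(n)]

def divergence(field, x, h=1e-6):
    """Central-difference estimate of the sum over r of d(field_r)/d(x_r)."""
    total = 0.0
    for r in range(len(x)):
        up = list(x); up[r] += h
        dn = list(x); dn[r] -= h
        total += (field(up)[r] - field(dn)[r]) / (2.0 * h)
    return total

v = [0.3, -0.7]
N = [q[r] * math.exp(v[r]) for r in range(2)]
print(divergence(v_dot, v))   # vanishes: dv_r/dt does not depend on v_r
print(divergence(n_dot, N))   # generally nonzero in the N variables
```

The flow is incompressible in the $v_r$ but not in the $N_r$, so a uniform density of system points is preserved only in the former.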

There are variables other than $v_r$ providing a Liouville's theorem; in fact a large class of them amongst which something resembling the transformation theory of dynamics may be built. But for our present purposes the $v_r$ suffice. A helpful feature of their definition is that their range of variability, unlike that of the $N_r$, is over all positive and negative numbers.

Next we reconstruct in $v$ language an important integral of the motion introduced by Volterra. In equation (4) multiply throughout by $q_r(e^{v_r} - 1)$ and sum over all $r$. Because of the antisymmetry of the $\alpha_{sr}$ there is left only

$$\sum_r \beta_r q_r \dot{v}_r \left(e^{v_r} - 1\right) = 0.$$

With $\tau_r = \beta_r q_r$ for convenience, an integration gives

$$G = \sum_r \tau_r \left(e^{v_r} - v_r\right) = \text{constant}. \tag{6}$$


This is the only general integral that is visible. It is a universal,
single-valued constant of the motion. That it is a sum of terms
relating to the separate species in association is a considerable
advantage, allowing a natural specification of the "components"
of the system in the sense usual in statistical mechanics. We
shall call $G_r$ the members $\tau_r(e^{v_r} - v_r)$ and refer to "the G" of an
association or of a few, or one, of its member species. The equations
of motion may be written "canonically" in terms of G as

$$\dot v_r = \sum_s \gamma_{sr}\,\frac{\partial G}{\partial v_s}, \qquad \gamma_{sr} = \frac{\alpha_{sr}}{\beta_s\beta_r} = -\gamma_{rs},$$

which are loosely reminiscent of the Hamilton equations.
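As a numerical illustration of the integral (6), the following minimal sketch integrates a two-species association in the $v$ variables and monitors G along the trajectory. It assumes that equation (4) has the form $\beta_r \dot v_r = \sum_s \alpha_{sr} q_s (e^{v_s} - 1)$ with antisymmetric $\alpha_{sr}$; the parameter values, the Runge-Kutta stepper, and all function names are illustrative choices and are not taken from the original treatment.

```python
import math

# Hypothetical two-species parameters (illustrative values only).
BETA = (1.0, 2.0)                              # beta_r
Q = (1.0, 1.5)                                 # stationary populations q_r
A12 = 1.0                                      # alpha_12 = -alpha_21 (antisymmetry)
TAU = tuple(b * q for b, q in zip(BETA, Q))    # tau_r = beta_r * q_r


def vdot(v):
    """Right-hand side of the assumed equation (4) for two species."""
    v1, v2 = v
    # beta_1 dv1/dt = alpha_21 q_2 (e^{v2} - 1), with alpha_21 = -A12
    d1 = -A12 * Q[1] * (math.exp(v2) - 1.0) / BETA[0]
    # beta_2 dv2/dt = alpha_12 q_1 (e^{v1} - 1)
    d2 = A12 * Q[0] * (math.exp(v1) - 1.0) / BETA[1]
    return (d1, d2)


def G(v):
    """Integral of the motion, equation (6)."""
    return sum(t * (math.exp(x) - x) for t, x in zip(TAU, v))


def rk4_step(v, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = vdot(v)
    k2 = vdot(tuple(x + 0.5 * dt * k for x, k in zip(v, k1)))
    k3 = vdot(tuple(x + 0.5 * dt * k for x, k in zip(v, k2)))
    k4 = vdot(tuple(x + dt * k for x, k in zip(v, k3)))
    return tuple(x + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for x, a, b, c, d in zip(v, k1, k2, k3, k4))


def max_drift(v0=(0.6, -0.3), dt=0.01, steps=5000):
    """Largest |G(t) - G(0)| seen along the numerical trajectory."""
    g0, v, worst = G(v0), v0, 0.0
    for _ in range(steps):
        v = rk4_step(v, dt)
        worst = max(worst, abs(G(v) - g0))
    return worst


if __name__ == "__main__":
    print("max drift of G:", max_drift())
```

Under these assumptions the drift of G should be limited by the integrator's truncation error alone, which is a convenient check on any numerical treatment of the equations of motion.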

Each $G_r$ has the minimum value $\tau_r$, occurring for $v_r = 0$, and increases
monotonically for $v_r$ increasing positively or negatively;
the total $G = \sum_r G_r$ has the absolute minimum value $\sum_r \tau_r$ (by a slight
change of the variables $v_r$ it could be arranged that the minimum G
is zero). A surface of constant G in phase has all the necessary
mathematical properties we need; it encloses a simply-connected
region of finite volume and is as smooth as wanted.

We define according to Gibbs an ensemble in statistical equilibrium
(stationary ensemble) as one for which $\partial\rho/\partial t = 0$. The
properties of such ensembles are then the same at all times. Of
special interest are ensembles with densities which are functions
of G alone. These are stationary and satisfy Liouville's equation
(5) as may be directly verified. The ensemble average of any function
$f(v_1, v_2, \ldots, v_n)$ of phase coordinates is defined to be

$$\bar f = \frac{\int \rho f\,d\tau}{\int \rho\,d\tau}, \qquad (7)$$

the integrals being over all of phase space. $\rho$ is thus of the nature
of a probability density.

Let us admit at this point the fundamental statistical hypothesis 
that, for purposes of finding expected values of variables of inter- 
est for an association about which there is only limited knowledge, 
equal extensions in phase corresponding equally well to this 
knowledge be assigned equal a priori probabilities. This is simply 
to say that for a statistical survey we contemplate all possible cop- 
ies of a system compatible with what information we have about it, 
and in ignorance beyond this point weigh all copies equally; the 
phase space appropriately populated with system points is then 
just machinery for conducting the survey. It may be noted that a 
separate statistical hypothesis is not needed for some purposes if 
an ergodic theorem, ensuring the equality of time averages over a 
single system and phase averages over a suitable ensemble of 
systems, can be established (ter Haar 1954, 1955). We shall later 
have an indication that there may be such a theorem. 




Suppose now that our knowledge of a biological association is
only that its G is constant at some value $G_0$. Then to equal regions
on the surface $G = G_0$ we give equal probability, that is, we
choose

$$\rho = \rho_0\,\delta(G - G_0),$$

where $\delta$ stands for the delta function, zero everywhere except at
the point where its argument vanishes and there so large that an
integral of $\delta$ over a region containing this point is unity ($\rho_0$ is an
unimportant numerical constant). This defines the microcanonical
ensemble of Gibbs. In a well-known computation we may represent
an element of volume $d\tau$ as $dS\,dn$ = element of area on a surface of
constant G × increment of length normal to the surface; the latter
is $dG/|\nabla G|$ where $dG$ is the difference in G-values of two neighboring
constant-G surfaces; thence equation (7) specializes to

$$\bar f = \frac{\int f\,\rho_0\,\delta(G - G_0)\,\dfrac{dS\,dG}{|\nabla G|}}{\int \rho_0\,\delta(G - G_0)\,\dfrac{dS\,dG}{|\nabla G|}} = \frac{\int f\,\dfrac{dS}{|\nabla G|}}{\int \dfrac{dS}{|\nabla G|}}. \qquad (8)$$

The integrals are surface integrals over $G = G_0$.
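Taking for example $f$ to be

$$T_r = v_r\,\frac{\partial G}{\partial v_r} = \tau_r\left(\frac{N_r}{q_r} - 1\right)\log\frac{N_r}{q_r},$$

we note first that since

$$\nabla G = \sum_r \frac{\partial G}{\partial v_r}\,\hat v_r = |\nabla G|\,\hat n$$

($\hat v_r$ denoting a unit vector in the $v_r$ direction and $\hat n$ a unit normal
vector to the surface G = constant), the direction cosines of $\hat n$ are

$$\hat n\cdot\hat v_r = \frac{\partial G/\partial v_r}{|\nabla G|},$$

so that

$$v_r\,\frac{\partial G}{\partial v_r}\,\frac{dS}{|\nabla G|} = v_r\,(\hat n\cdot\hat v_r)\,dS = \hat n\cdot\mathbf{v}_r\,dS$$

[$\mathbf{v}_r$ denoting the vector to the point $(0, \ldots, 0, v_r, 0, \ldots, 0)$]. Then,
calling the denominator in equation (8) $A_0$, the ensemble average
of $T_r$ is

$$\bar T_r = \frac{1}{A_0}\int \hat n\cdot\mathbf{v}_r\,dS = \frac{1}{A_0}\int \operatorname{div}\mathbf{v}_r\,d\tau = \frac{\tau_0}{A_0}$$

for all $r$; here we have used Gauss' divergence theorem and represented
the volume enclosed by the surface $G = G_0$ as $\tau_0$.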

This result is analogous to the equipartition theorem of physics.
In words, the mean T for any species is the same as for any other;
or, the total T of the biological association in the mean is equally
distributed amongst all species. Were the association ergodic,
"mean" here would refer to "time average for one association."
Unlike the situation in physics, this equipartition result should be
susceptible of experimental test. The quantity T will be recognized
to be a loose analog of kinetic energy.

Another interesting result follows from the calculation of the
average of

$$D_r = \frac{\partial G}{\partial v_r} = \tau_r\left(\frac{N_r}{q_r} - 1\right).$$

By a manipulation similar to the previous one we get

$$\bar D_r = \frac{1}{A_0}\int \hat n\cdot\hat v_r\,dS = \frac{1}{A_0}\int \operatorname{div}\hat v_r\,d\tau = 0.$$

The ensemble average of $N_r$ is $q_r$.

As has been mentioned, the time average of $N_r$ is also $q_r$. This
follows at once from equation (3) upon integrating from time 0 to
time $t$,

$$\frac{\beta_r}{t}\log\frac{N_r(t)}{N_r(0)} = \varepsilon_r\beta_r + \sum_s \alpha_{sr}\left(\frac{1}{t}\int_0^t N_s(t')\,dt'\right),$$

whence taking $t$ large and remembering that $N_r$ remains bounded, it
is seen that

$$\text{time averages of the } N_r = \lim_{t\to\infty}\frac{1}{t}\int_0^t N_r(t')\,dt'$$

all satisfy the same equation (2) as the $q$'s.
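The statements about time averages can be explored numerically. The sketch below follows one two-species association for a long time and accumulates the averages of $N_r = q_r e^{v_r}$ and of $T_r = \tau_r v_r (e^{v_r} - 1)$. It again assumes the two-species form $\beta_r \dot v_r = \sum_s \alpha_{sr} q_s (e^{v_s} - 1)$ for equation (4); the parameter values, step size, and function names are hypothetical. For two species the orbit is a closed curve, so the long-time averages are a natural stand-in for averages over the curve G = constant.

```python
import math

# Illustrative two-species parameters (assumed values, not from the paper).
BETA = (1.0, 2.0)                          # beta_r
Q = (1.0, 1.5)                             # stationary populations q_r
A12 = 1.0                                  # alpha_12 = -alpha_21
TAU = (BETA[0] * Q[0], BETA[1] * Q[1])     # tau_r = beta_r q_r


def vdot(v1, v2):
    """Assumed two-species form of equation (4) in the v variables."""
    return (-A12 * Q[1] * (math.exp(v2) - 1.0) / BETA[0],
            A12 * Q[0] * (math.exp(v1) - 1.0) / BETA[1])


def rk4_step(v1, v2, dt):
    """One classical Runge-Kutta step."""
    a1, a2 = vdot(v1, v2)
    b1, b2 = vdot(v1 + 0.5 * dt * a1, v2 + 0.5 * dt * a2)
    c1, c2 = vdot(v1 + 0.5 * dt * b1, v2 + 0.5 * dt * b2)
    d1, d2 = vdot(v1 + dt * c1, v2 + dt * c2)
    return (v1 + dt * (a1 + 2 * b1 + 2 * c1 + d1) / 6.0,
            v2 + dt * (a2 + 2 * b2 + 2 * c2 + d2) / 6.0)


def time_averages(v0=(0.6, -0.3), dt=0.01, steps=100_000):
    """Long-time averages of N_r and of T_r along one trajectory."""
    v = v0
    n_sum = [0.0, 0.0]
    t_sum = [0.0, 0.0]
    for _ in range(steps):
        v = rk4_step(v[0], v[1], dt)
        for r in range(2):
            e = math.exp(v[r])
            n_sum[r] += Q[r] * e                     # N_r = q_r e^{v_r}
            t_sum[r] += TAU[r] * v[r] * (e - 1.0)    # T_r = v_r dG/dv_r
    return [s / steps for s in n_sum], [s / steps for s in t_sum]


if __name__ == "__main__":
    n_bar, t_bar = time_averages()
    print("time-averaged N:", n_bar, "stationary q:", Q)
    print("time-averaged T:", t_bar)
```

With these assumed values the averaged populations should settle near the $q_r$, and the two averaged $T_r$ should agree with each other to within the error left by the final incomplete cycle.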

The suggestion is that the system under study may be ergodic, a
single system point in general travelling over the surface $G = G_0$
comprehensively enough as eventually to cover nearly all parts of it,
so that looking at it for a long time is tantamount to looking at a
Gibbs ensemble covering the surface at one time. The suggestion
is strengthened by a glance at Poincaré’s recurrence theorem in 
mechanics (Chandrasekhar, 1943) stating that under quite general 
conditions, satisfied here, the system point starting at a given 
point will wander back to this point arbitrarily closely (not exactly) 
and infinitely often. This seems at least necessary for ergodicity; 
otherwise there would be regions of phase space that would be 
avoided relative to others during the long-time motion. 

A valuable consequence of Poincaré's theorem in the present
context is that all species, not just some as proven by Volterra,
must in general exhibit undamped oscillations; for the only alternative
to continual oscillation is a tendency of the N toward finite
limits, and such a tendency in any N clearly will not permit the
system point to return near a given starting point. In Volterra's
discussion of small oscillations of an association about its stationary
state is found a case in point.

It will be evident, from the fact that experimental data relate to
perhaps a few selected species whose populations are observed as
a function of time, that an ergodic theorem in the population dynamics
has even a more important position than in classical mechanics.
We shall adopt the surmise that biological associations
are ergodic.


4. Canonical Ensembles. 

We may ask now about the behavior of a part, or component, consisting
in, say, only $\nu$ of the total $n$ species, of an association.
The component does not have its G constant throughout time but
exchanges G with the rest of the association, only the total G
being conserved. Corresponding to the points on the surface $G = G_0$
in the microcanonical distribution are points in the sub-space of
dimension $\nu$ representing configurations of the component.

How are these component points distributed? The answer is a
basic proposition in statistical mechanics: they are distributed
according to the law

$$\rho_\nu = e^{(\psi - G_\nu)/\theta},$$

defining Gibbs' canonical ensemble. The factor (Gibbs' phase
integral)

$$e^{-\psi/\theta} = \int e^{-G_\nu/\theta}\,d\tau_\nu$$

is just such as to normalize the distribution,

$$\int \rho_\nu\,d\tau_\nu = 1.$$

$\rho_\nu(v_1, v_2, \ldots, v_\nu)\,d\tau_\nu$ represents the probability that a member of
the ensemble (which is in statistical equilibrium) chosen at random
will be found in the volume element $d\tau_\nu$ around $(v_1, v_2, \ldots, v_\nu)$.

For the mean value of any function of phase we have

$$\bar f = \int f\,\rho_\nu\,d\tau_\nu.$$

The distribution is characterized by the constant $\theta$, its modulus,
rather than by $G_\nu$ (which here is not constant) as in the microcanonical
distribution.

The importance of the canonical ensemble in physics comes from
the fact that it is a representative ensemble with a capacity for
describing not isolated systems with a fixed energy but those which
are in thermal equilibrium with their surroundings, continually exchanging
energy with them. In the theoretical construction the
residual system, that of $n - \nu$ degrees of freedom left over from the
original one when the component $\nu$ is separated for individual
study, holds the position of being the "heat bath" in which the
component is immersed. The modulus $\theta$ represents the thermodynamic
temperature, and $\psi$ the free energy of a system in thermodynamic
equilibrium. Through this same door we enter into a
"thermodynamic" description of biological association. It is interesting
to see that in our case the decomposability of G does not
lead to the awkwardness of introducing some "small" G of interaction
between components to provide G-exchange between them,
as occurs in mechanics when the Hamiltonian is split into parts.
This is inherent in the Volterra equations; the rigorous separation
into components in general does not keep them from interacting. It
is not implied of course that a weak interaction cannot be introduced
between otherwise noninteracting associations.

Another important feature of the canonical ensemble is that
when the number of degrees of freedom of its members is large a
great preponderance of them have G's in the immediate vicinity of
$\bar G$ (the canonical mean G), and, not unexpectedly, canonical averages
are substantially the same as microcanonical ones on the
surface $G = \bar G$. This is easy to prove in a familiar computation if
G is sufficiently small, when

$$G_r = \tau_r\,(e^{v_r} - v_r) \approx \tau_r\left(1 + \tfrac{1}{2}v_r^2\right)$$

and G = constant is an ellipsoid; for larger G an approximate computation
shows the same thing (effectively what is involved is a
sufficiently rapid ascent of the volume enclosed by successively
larger constant-G surfaces). The canonical ensemble thus also
has the aspect of a mathematical stratagem to simplify the calculation
of microcanonical averages. Under the adopted ergodic
hypothesis we can then regard canonical averages as time averages
also. Of course the number of biological degrees of freedom
seldom will rival the number of mechanical ones commonly encountered.
But it may nonetheless be substantial; when insubstantial
the canonical approximation to the microcanonical ensemble
weakens and a more precise analysis of component systems
is needed; or, for isolated systems with an insubstantial number of
degrees of freedom the microcanonical ensemble, and the thermodynamic
description provided by it, may be employed.

We must in any event emphasize the importance in the present 
statistical mechanics of fluctuations from expectation values. It 
will be appreciated also that time averages and relaxation times 
for attainment of equilibrium, practically speaking, refer here to 
intervals of time of a completely different order of magnitude than 
those frequently met in physics. 


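Dropping now the subscript $\nu$ we compute the canonical average
of $D_r$:

$$\bar D_r = \frac{\int \dfrac{\partial G}{\partial v_r}\,e^{-G/\theta}\,d\tau}{\int e^{-G/\theta}\,d\tau} = \frac{\int \dfrac{\partial G_r}{\partial v_r}\,e^{-G_r/\theta}\,dv_r}{\int e^{-G_r/\theta}\,dv_r} = \frac{-\theta\,\bigl[e^{-G_r/\theta}\bigr]_{-\infty}^{\infty}}{\int e^{-G_r/\theta}\,dv_r} = 0.$$

As before, the mean $N_r$ is appropriately $q_r$. Similarly, for the average
of $T_r$ we find

$$\bar T_r = \frac{\int v_r\,\dfrac{\partial G}{\partial v_r}\,e^{-G/\theta}\,d\tau}{\int e^{-G/\theta}\,d\tau} = \frac{-\theta\,\bigl[v_r\,e^{-G_r/\theta}\bigr]_{-\infty}^{\infty} + \theta\int e^{-G_r/\theta}\,dv_r}{\int e^{-G_r/\theta}\,dv_r} = \theta.$$

This gives not only the earlier result of equipartition of T but an
insight into the meaning of the "temperature" $\theta$ of biological
association.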
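These two canonical averages lend themselves to a direct numerical check for a single species. The following sketch evaluates the ratio of integrals by the trapezoidal rule with $G_r = \tau_r(e^{v} - v)$; the integration limits, the grid size, and the function name `canonical_average` are assumptions made for illustration only.

```python
import math


def canonical_average(f, tau, theta, lo=-40.0, hi=8.0, n=48_000):
    """Trapezoidal estimate of the ratio of the integral of f e^{-G_r/theta}
    to the integral of e^{-G_r/theta}, for one species with G_r = tau (e^v - v)."""
    h = (hi - lo) / n
    num = den = 0.0
    for i in range(n + 1):
        v = lo + i * h
        # Subtract the minimum value tau of G_r to keep the weight of order
        # one; the constant factor cancels between numerator and denominator.
        w = math.exp(-(tau * (math.exp(v) - v) - tau) / theta)
        if i == 0 or i == n:
            w *= 0.5
        num += f(v) * w
        den += w
    return num / den


def D(v, tau):
    """D_r = dG_r/dv_r = tau (e^v - 1)."""
    return tau * (math.exp(v) - 1.0)


def T(v, tau):
    """T_r = v dG_r/dv_r."""
    return v * D(v, tau)


if __name__ == "__main__":
    tau, theta = 1.0, 2.0
    print("mean D:", canonical_average(lambda v: D(v, tau), tau, theta))
    print("mean T:", canonical_average(lambda v: T(v, tau), tau, theta),
          "theta:", theta)
```

The lower limit is taken far out on the negative side because the weight decays only exponentially in $v$ there, whereas for positive $v$ it falls off doubly exponentially.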
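A more perspicuous view of $\theta$ comes from the average of $D_r^2$:

$$\overline{D_r^2} = \overline{\left(\frac{\partial G}{\partial v_r}\right)^2} = \frac{\int \left(\dfrac{\partial G_r}{\partial v_r}\right)^2 e^{-G_r/\theta}\,dv_r}{\int e^{-G_r/\theta}\,dv_r} = \theta\,\overline{\frac{\partial^2 G_r}{\partial v_r^2}}$$

(after an integration by parts); but, since $\partial^2 G_r/\partial v_r^2 = \partial G_r/\partial v_r + \tau_r$, this becomes

$$\overline{D_r^2} = \theta\,\overline{\frac{\partial G_r}{\partial v_r}} + \theta\tau_r = \theta\tau_r,$$

or, for all species $r$,

$$\frac{\overline{D_r^2}}{\tau_r} = \tau_r\,\overline{\left(\frac{N_r}{q_r} - 1\right)^2} = \frac{\beta_r}{q_r}\,\overline{(N_r - q_r)^2} = \theta.$$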
Gr 


In other words the temperature measures, in one number common
to all species, the mean square deviations of the populations from
their stationary values $q_r$, and vice-versa. Zero temperature corresponds
to the completely "quiet" stationary state of biological
association. The temperature is, so to speak, a kind of indicator
of the level of excitation of the association from its stationary
state. Its greater significance is, according to an established
theorem, that it tells the preferred direction of flow of G from one
association to another weakly coupled to it: on the average the
association with higher $\theta$ will lose G and decrease its $\theta$, and inversely
for the low-$\theta$ association. This result, perhaps in the
quantitative form following from a knowledge of the "heat" capacities
(see below), but at least qualitatively, may be amenable
to experimental test.

We might perhaps here conveniently introduce a definition of the
thermodynamic state of equilibrium of an association as that for
which the mean $D_r^2/\tau_r$ has the same value for all species.

Together with $\sum T_r$, the $\sum D_r^2/\tau_r$ is partitioned equally among all
species on the average.

The Gibbs phase integral is, with $\alpha = 1/\theta$,

$$Z = \prod_r (\tau_r\alpha)^{-\tau_r\alpha}\,\Gamma(\tau_r\alpha) = \prod_r Z_r.$$

What is the probability that one species will have its $v_r$ in $v_r$,
$v_r + dv_r$? From either an integration over all coordinates but $v_r$, or
from taking a component to be the one species $r$, this is

$$P_r\,dv_r = \frac{e^{-G_r/\theta}\,dv_r}{Z_r},$$

or, in terms of $n_r = N_r/q_r$,

$$P(n_r)\,dn_r = \frac{(\alpha\tau_r)^{\alpha\tau_r}\,n_r^{\alpha\tau_r - 1}\,e^{-\alpha\tau_r n_r}\,dn_r}{\Gamma(\alpha\tau_r)}.$$
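The closed form of the single-species phase integral can be compared with a direct quadrature of $\int e^{-G_r/\theta}\,dv_r$, where $G_r/\theta = x\,(e^{v} - v)$ and $x = \tau_r/\theta$. The sketch below is a minimal check under assumed integration limits and grid size; the function names are hypothetical.

```python
import math


def log_Z_closed(x):
    """Logarithm of Gamma(x) x^{-x}, with x = tau_r / theta."""
    return math.lgamma(x) - x * math.log(x)


def Z_quadrature(x, lo=-60.0, hi=8.0, n=68_000):
    """Trapezoidal estimate of the integral of exp(-x (e^v - v)) over v."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        v = lo + i * h
        w = math.exp(-x * (math.exp(v) - v))
        total += 0.5 * w if i in (0, n) else w
    return total * h


def density_n(n, x):
    """P(n_r): x^x n^(x-1) e^(-x n) / Gamma(x), for n > 0."""
    return math.exp(x * math.log(x) + (x - 1.0) * math.log(n)
                    - x * n - math.lgamma(x))


if __name__ == "__main__":
    for x in (0.5, 2.0, 5.0):
        print(x, Z_quadrature(x), math.exp(log_Z_closed(x)))
```

The substitution $n = e^{v}$ turns the integral over $v$ into the familiar gamma integral, which is why the density in $n_r$ has the form shown.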


This is analogous to the Maxwell-Boltzmann distribution law. It
is exactly the distribution of the "intrinsic abundance" of a species
assumed by Corbet, Fisher, and Williams (1943; also Kendall,
1948) to deduce the probability

[Equation (9), the expression for $P_z$, is not recoverable from the scanned source.] (9)

that a catch of individuals in time $t$ contains just $z$ individuals, in
the limit that $\alpha\tau_r \to 0$ [the remainder of the limit condition is not
recoverable from the scanned source].
These authors found it necessary to take this limit to comprehend
the experimental data on catches of butterflies and moths; some
meaning of the then obscure but important shape-determining parameter
$\alpha\tau_r$ ($k$ in Kendall's notation) and of the limit $\alpha\tau_r \to 0$ ($\theta \gg \tau_r$)
here becomes clear. We may say, perhaps, that the observations
were on species of low "intrinsic temperature" $\tau_r$ compared to that
($\theta$) of the equilibrium state (one evolved over a long period of time)
of the encompassing biological association; quite possibly $\theta$ was
bigger than a great many $\tau$'s for different species of that association.
The remarkable success of the result (9) would seem to be
in some measure an experimental verification of the present scheme.
The moments of order $p$ of $n_r$ are

$$\overline{n_r^{\,p}} = \frac{\Gamma(\alpha\tau_r + p)}{(\alpha\tau_r)^p\,\Gamma(\alpha\tau_r)},$$

and in particular $\overline{N_r/q_r} = 1$ again. In the limit $\theta \to 0$ ($\alpha \to \infty$)
these moments all are unity, expressing, as is necessary, that

$$P(n_r) \to \delta(n_r - 1).$$

When $\alpha\tau_r > 1$ the most probable $n_r$, call it $[n_r]$, is

$$[n_r] = 1 - \frac{1}{\alpha\tau_r} = 1 - \frac{\theta}{\tau_r},$$

which always is less than $\bar n_r$. For $\alpha\tau_r < 1$ the most probable $n_r$ is
0, indeed $P(n_r = 0) = \infty$ ($\alpha\tau_r < 1$). The distribution in general assigns
appreciable weight to N's less than $q_r$. This is understandable
in view of the fact that the surfaces of constant G have a
relatively large lobe in the region of $v_r < 0$ compared to the smaller
lobe for all $v_r > 0$; then under the ergodic hypothesis we expect
phase points generally to spend more time in the first region than
the second. The importance of having introduced the "canonical"
variables $v_r$ is here apparent.
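The moments and the most probable value are easily tabulated. The sketch below writes $x = \alpha\tau_r = \tau_r/\theta$ and uses the logarithm of the gamma function to avoid overflow; the function names are illustrative, and the mean-square deviation is included because it should reproduce the relation $\overline{(N_r/q_r - 1)^2} = \theta/\tau_r$ obtained above.

```python
import math


def moment(p, x):
    """p-th moment of n_r = N_r/q_r: Gamma(x + p) / (x^p Gamma(x))."""
    return math.exp(math.lgamma(x + p) - math.lgamma(x) - p * math.log(x))


def most_probable_n(x):
    """Mode of the density proportional to n^(x-1) e^(-x n):
    1 - 1/x when x > 1, and 0 otherwise."""
    return 1.0 - 1.0 / x if x > 1.0 else 0.0


def mean_square_deviation(x):
    """Mean of (n_r - 1)^2, expected to equal 1/x = theta/tau_r."""
    return moment(2, x) - 2.0 * moment(1, x) + 1.0


if __name__ == "__main__":
    for x in (0.5, 2.0, 4.0, 100.0):
        print(x, moment(1, x), mean_square_deviation(x), most_probable_n(x))
```

As $x$ grows (low temperature) the tabulated moments approach unity and the mode approaches the mean, in keeping with the delta-function limit.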
By ordering the species according to the sequence

$$\tau_1 \le \tau_2 \le \cdots \le \tau_n$$

we may distinguish two categories. At a given temperature $\theta$ the
lower group having $\tau$'s less than $\theta$ have predominantly very low
populations most probably; the upper group with $\tau$'s exceeding $\theta$
most probably have appreciably higher populations.


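Turning now to the evaluation of the conventional thermodynamic
variables we have for the "free energy" $\psi$

$$-\frac{\psi}{\theta} = \log Z, \qquad \psi = \sum_r\left[\tau_r\log\tau_r\alpha - \frac{1}{\alpha}\log\Gamma(\tau_r\alpha)\right] = \sum_r \psi_r,$$

or, with $x_r = \tau_r/\theta$,

$$\frac{\psi_r}{\tau_r} = \log x_r - \frac{\log\Gamma(x_r)}{x_r}.$$

The "internal energy ($\bar G$)" is

$$\bar G = -\frac{\partial\log Z}{\partial\alpha} = \sum_r\left[\tau_r\log\tau_r\alpha + \tau_r - \tau_r\,\varphi(\tau_r\alpha)\right] = \sum_r \bar G_r, \qquad \frac{\bar G_r}{\tau_r} = \log x_r + 1 - \varphi(x_r),$$

where $\varphi(x) = d\log\Gamma(x)/dx$ is the digamma function.
The "heat (G) capacity" is

$$C = \frac{\partial\bar G}{\partial\theta} = \sum_r\left[x_r^2\,\varphi'(x_r) - x_r\right] = \sum_r C_r,$$

$\varphi'(x)$ denoting $d\varphi/dx$, the trigamma function. The "entropy" is

$$S = \frac{\bar G - \psi}{\theta} = \frac{\bar G}{\theta} + \log Z = \sum_r\left[-x_r\,\varphi(x_r) + x_r + \log\Gamma(x_r)\right] = \sum_r S_r.$$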
The behavior of the single-species contributions to these variables
as functions of $1/x_r = \theta/\tau_r$ or of $x_r$ is shown in the figures (Figures
1-4). All but $\psi_r/\tau_r$ increase monotonically with increasing
temperature. $C_r$ alone tends for large $\theta$ asymptotically to a limit,
$C_r = 1$, whence an analog to the Dulong-Petit law, and comes
to the value $\tfrac{1}{2}$ linearly as $\theta$ vanishes, whence no analog to the
Nernst heat theorem. $S_r$ increases in magnitude without bound in
the limits of both small and large $\theta$, the state of lowest ($-\infty$)
entropy being the stationary state of association, $\theta = 0$.
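The single-species thermodynamic functions can be evaluated with nothing more than the logarithm of the gamma function and its first two derivatives. The sketch below writes $x = \tau_r/\theta$ and obtains the digamma and trigamma functions from the standard recurrence relations and asymptotic series; this particular numerical scheme, the shift threshold of 10, and the function names are assumptions of the example rather than anything prescribed in the text.

```python
import math


def digamma(x):
    """d log Gamma / dx, by upward recurrence and an asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc -= 1.0 / x          # digamma(x) = digamma(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (acc + math.log(x) - 0.5 / x
            - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252)))


def trigamma(x):
    """Derivative of digamma, by upward recurrence and an asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc += 1.0 / (x * x)    # trigamma(x) = trigamma(x + 1) + 1/x^2
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (acc + 1.0 / x + 0.5 * inv2
            + (1.0 / 6 - inv2 * (1.0 / 30 - inv2 / 42)) / x ** 3)


def free_energy_over_tau(x):
    """psi_r / tau_r = log x - log Gamma(x) / x."""
    return math.log(x) - math.lgamma(x) / x


def internal_energy_over_tau(x):
    """Mean G_r / tau_r = log x + 1 - digamma(x)."""
    return math.log(x) + 1.0 - digamma(x)


def heat_capacity(x):
    """C_r = x^2 trigamma(x) - x."""
    return x * x * trigamma(x) - x


def entropy(x):
    """S_r = -x digamma(x) + x + log Gamma(x)."""
    return -x * digamma(x) + x + math.lgamma(x)


if __name__ == "__main__":
    for ratio in (0.01, 0.1, 1.0, 10.0, 100.0):      # ratio = theta / tau
        x = 1.0 / ratio
        print(ratio, internal_energy_over_tau(x), heat_capacity(x), entropy(x))
```

Tabulating the functions against $\theta/\tau_r$ in this way gives curves of the kind described in the paragraph above, including the limiting heat capacities at high and low temperature.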


In the limit of high temperatures,

$$\frac{\bar G_r}{\tau_r} = \frac{\theta}{\tau_r}\left\{1 - O\!\left(\frac{\tau_r}{\theta}\log\frac{\theta}{\tau_r}\right)\right\}.$$

Therefore in this limit $\bar G$ is equipartitioned amongst all species.
At small $\theta$,

$$\frac{\bar G_r}{\tau_r} - 1 \to \frac{1}{2}\,\frac{\theta}{\tau_r}, \qquad \bar G_r - \tau_r \to \frac{1}{2}\,\theta,$$

so that

$$\bar G_r - G_{r\,\min} \qquad (G_{r\,\min} = \tau_r)$$

is equipartitioned. Altogether then $\sum_r (\bar G_r - G_{r\,\min})$ is equipartitioned
for low temperatures at $\tfrac{1}{2}\theta$ per degree of freedom and for high
temperatures at $\theta$ per degree of freedom. We may note also that the
general rule of mixing,

$$\int_{\theta_1}^{\theta} C_A\,d\theta = \bar G_A(\theta) - \bar G_A(\theta_1) = \bar G_B(\theta_2) - \bar G_B(\theta) = \int_{\theta}^{\theta_2} C_B\,d\theta,$$

giving the final equilibrium temperature $\theta$ of two equilibrium associations
A and B at initial temperatures $\theta_1$ and $\theta_2 > \theta_1$ when
placed in weak interaction, has the corollary

$$\theta = \frac{\nu_A\theta_1 + \nu_B\theta_2}{\nu_A + \nu_B}$$

for all temperatures sufficiently high or all sufficiently low ($\nu_A$ and
$\nu_B$ denoting the number of species in each association).

FIGURE 1a. Single-species contribution to internal energy, giving $\bar G/\tau$ as
function of association temperature $\theta$ (plotted as $\theta/\tau$) for fixed intrinsic temperature $\tau$.

FIGURE 1b. Single-species G as function of $\tau$ for fixed $\theta$.
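The rule of mixing can be solved numerically for the final temperature by requiring that the total mean G of the two associations be unchanged. The sketch below does this by bisection, using $\bar G_r = \tau_r(\log x_r + 1 - \varphi(x_r))$ with $x_r = \tau_r/\theta$; the digamma routine, the sample values of the $\tau_r$, and the function names are assumptions introduced for the illustration.

```python
import math


def digamma(x):
    """d log Gamma / dx, by upward recurrence and an asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (acc + math.log(x) - 0.5 / x
            - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252)))


def mean_G(taus, theta):
    """Canonical mean G of an association at temperature theta."""
    return sum(t * (math.log(t / theta) + 1.0 - digamma(t / theta))
               for t in taus)


def final_temperature(taus_a, theta_a, taus_b, theta_b, tol=1e-10):
    """Bisection for the theta at which the combined mean G equals the
    sum of the initial mean G's (mean G increases with theta)."""
    total = mean_G(taus_a, theta_a) + mean_G(taus_b, theta_b)
    lo, hi = sorted((theta_a, theta_b))
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if mean_G(taus_a, mid) + mean_G(taus_b, mid) < total:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    taus_a, taus_b = [1.0, 2.0], [0.5, 1.0, 1.5]     # hypothetical tau_r values
    theta = final_temperature(taus_a, 200.0, taus_b, 400.0)
    print("final theta:", theta, "weighted mean:", (2 * 200.0 + 3 * 400.0) / 5)
```

For initial temperatures far above or far below the $\tau_r$ the computed value should lie close to the species-weighted mean of the corollary; at intermediate temperatures the two may differ appreciably.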

The entropy, as in conventional statistical mechanics, measures
higher for systems in equilibrium, represented by the canonical
ensemble, than the corresponding quantity ($-\overline{\log\rho}$) for other states
of the systems (represented by other than canonical ensembles)
having the same mean G. And in the well-known sense of Gibbs'
coarse-grained view of the density in phase, non-equilibrium states
tend to decline into equilibrium ones of maximal entropy. It must
not be forgotten here that, because the number of degrees of freedom
in a biological association is not so enormously large as that
of physical systems studied by the same methods, the tendency
toward equilibrium may be expected to be somewhat obscured by
noticeable fluctuations.

FIGURE 2b. Free energy per species for fixed $\theta$ and variable $\tau$.

We have thermodynamic variables but no thermodynamics as yet.
The previous considerations are really calorimetric rather than
thermodynamic in character, as the only independent variable of
state is the temperature $\theta$ and the only process contemplated is
"heat" (G) transfer from one system to another. The thing that is
missing is the concept of work.

In mechanics the Hamiltonian is laden with "external parameters"
such as the volume of the system and strengths of external gravitational
or other fields acting on its parts; when altered these produce
alterations in the state of the system, in fact they alter the
system itself, and so are variables of state. What, in Schrödinger's
phrase (1952), are these "screws, pistons, and what not" by which
we can squeeze on a biological association? What indeed are the
observables other than the population numbers? They are the real
physical and chemical variables of the milieu extérieur: physical
temperature, pressure, radiation and other field strengths, and
chemical abundances. Physical temperature particularly has a preeminent
place, being like the volume in thermodynamics, a universal
type of external parameter and playing a universally important
role.

FIGURE 3. Heat capacity per species as function of association
temperature $\theta$.

FIGURE 4. Entropy per species as function of association temperature $\theta$.

Consider therefore that the $\tau_r$ are functions of these exterior
variables, say $a_i$, and through them so is the G of an association.
In the customary way, define the generalized forces,

$$F_i = -\frac{\partial\psi}{\partial a_i} = -\sum_r \frac{\partial\psi_r}{\partial a_i} = \sum_r F_{ir}, \qquad F_{ir} = -\frac{\partial\tau_r}{\partial a_i}\left(\log x_r + 1 - \varphi(x_r)\right), \qquad (10)$$

and plainly

$$F_{ir} = -\frac{\bar G_r}{\tau_r}\,\frac{\partial\tau_r}{\partial a_i} = -\bar G_r\,\frac{\partial\log\tau_r}{\partial a_i}.$$

Equations (10) are loosely analogous to Dalton's law of partial
pressures and constitute a system of equations of state. If there
be a single equation of state it is perhaps equation (10) with $a_i$ =
physical temperature $T$. When the association temperature $\theta$ is
large, because of the equipartition of G we have

$$F_i = -\theta\sum_r \frac{\partial\log\tau_r}{\partial a_i}.$$

For the sake of illustration only we might suppose, in a crude but
not impossible approximation, that each $\tau_r(T) = b_r T^{-a_r}$ over some
limited range of $T$; then the equation of state reads

$$F_T\,T = \nu\theta\,\bar a \qquad \left(\bar a = \frac{1}{\nu}\sum_r a_r\right),$$

$\nu$ being the number of degrees of freedom in the association. This
is analogous to the gas law $PV = NkT$, with $F_T$, $T$, $\theta$ the analogs
of $P$, $V$, $T$, though $F_T$ of course is not a single datum of direct
experience as is $P$.
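The illustrative equation of state can be checked against the free energy directly. The sketch below differentiates $\psi = \sum_r [\tau_r \log x_r - \theta \log\Gamma(x_r)]$, $x_r = \tau_r/\theta$, with respect to the physical temperature $T$ by a central difference, taking the power law $\tau_r(T) = b_r T^{-a_r}$ as given; the numerical values of $b_r$, $a_r$, $T$ and $\theta$, the step size, and the function names are all hypothetical.

```python
import math


def free_energy(taus, theta):
    """psi = sum over r of tau_r log x_r - theta log Gamma(x_r)."""
    return sum(t * math.log(t / theta) - theta * math.lgamma(t / theta)
               for t in taus)


def taus_at(T, b, a):
    """Illustrative power law tau_r(T) = b_r T^(-a_r)."""
    return [bi * T ** (-ai) for bi, ai in zip(b, a)]


def force_T(T, theta, b, a, rel_step=1e-5):
    """F_T = -d psi / dT at fixed theta, by a central difference."""
    h = rel_step * T
    up = free_energy(taus_at(T + h, b, a), theta)
    dn = free_energy(taus_at(T - h, b, a), theta)
    return -(up - dn) / (2.0 * h)


if __name__ == "__main__":
    b = (10.0, 100.0, 3.0)       # hypothetical coefficients b_r
    a = (1.0, 2.0, 0.5)          # hypothetical exponents a_r
    T, theta = 10.0, 500.0       # association temperature far above the tau_r
    lhs = force_T(T, theta, b, a) * T
    rhs = theta * sum(a)         # nu * theta * (mean of the a_r)
    print("F_T T =", lhs, " nu theta a_bar =", rhs)
```

Because the high-temperature equipartition of G holds only up to terms of relative order $(\tau_r/\theta)\log(\theta/\tau_r)$, the two sides are expected to agree to about that accuracy and no better.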

It is evident at this point that we are carried into thermodynamics
proper but without an adequate experimental frame of reference.
To proceed further into a discussion of cycles (like Carnot's) in a
$T, \theta$ plane, of transport processes (like conduction of G in physical
space, under some kind of Fourier conduction law), in short to
elaborate further the interplay in the large of the biological and
physical worlds, seems perhaps possible but now premature.


I am indebted to Dr. L. W. Phillips for his steady and material 
encouragement, and to Dr. W. Opechowski for a stimulating 
discussion. 


LITERATURE

Chandrasekhar, S. 1943. "Stochastic Problems in Physics and Astronomy."
Rev. Mod. Phy., 15, 1-89.

Corbet, A. S., Fisher, R. A., and Williams, C. B. 1943. "The Relation
between the Number of Species and the Number of Individuals in a
Random Sample of an Animal Population." J. Anim. Ecol., 12, 42-58.

Gibbs, J. W. 1902. Elementary Principles in Statistical Mechanics. New
Haven: Yale University Press.

Grad, H. 1952. "Statistical Mechanics of Dynamical Systems with Integrals
other than Energy." J. Phy. Chem., 56, 1039-1048.

Kendall, D. G. 1948. "On Some Modes of Population Growth Leading to
R. A. Fisher's Logarithmic Series Distribution." Biometrika, 35, 6-15.

Khinchin, A. I. 1949. Mathematical Foundations of Statistical Mechanics.
New York: Dover Publications.

Lotka, A. J. 1925. Elements of Physical Biology. Baltimore: Williams
and Wilkins Co.

Schrödinger, E. 1952. Statistical Thermodynamics. Cambridge: University
Press.

Takenake, S. 1941. "Über die Volterrasche Biologische Dynamik."
Jap. Jour. Med. Sci., Part III, Biophysics, 7, 129-140.

ter Haar, D. 1954. Elements of Statistical Mechanics. New York: Rinehart
and Co.

ter Haar, D. 1955. "Foundations of Statistical Mechanics." Rev. Mod.
Phy., 27, 289-338.

Tolman, R. C. 1938. The Principles of Statistical Mechanics. Oxford:
Clarendon Press.

Volterra, V. 1931. Leçons sur la Théorie Mathématique de la Lutte pour
la Vie. Paris: Gauthier-Villars.

Volterra, V. 1937. "Principes de Biologie Mathématique." Acta
Biotheoretica, 3, 1-36.


RECEIVED 10-10-56


BULLETIN OF
MATHEMATICAL BIOPHYSICS
VOLUME 19, 1957


FACTORS IN VISUAL ACUITY: I. NEURAL INHIBITION 
AND THE VISUAL PERCEPTION OF CONTOURS 


PETER H. GREENE 


COMMITTEE ON MATHEMATICAL BIOLOGY 
THE UNIVERSITY OF CHICAGO* 


Interpretations of the mechanisms of perception of contours or of Mach 
bands have stressed either the role of various spatial derivatives of light 
intensity at the retina or the importance of various forms of inhibitory 
effects between neighboring retinal elements. Evidence is presented 
here in support of the latter type of interpretation. It is considered that 
the brightness contrast and perceived contours arise from neural ele- 
ments, each of which is stimulated in proportion to the intensity of photo- 
receptor excitation at a point of the retina and inhibited in proportion to 
the mean intensity in some neighborhood of that point. The role of the 
spatial derivatives is best seen as a particular manifestation of the in- 
hibitory mechanism. Predictions based upon this hypothesis appear to be 
consistent with experimentally observed evidence. 


The present investigation is part of an exploration of some 
mechanisms frequently considered by physiologists in attempts to 
account for the overwhelming superiority of visual acuity over that 
which might be expected of the gross optical construction of the eye. 
An overall view of such mechanisms is found in the frequently 
cited paper of W. H. Marshall and S. A. Talbot (1942) which in 
particular emphasizes the importance of inhibitory neural cross- 
connections and of the response of the retina to stimulus gradients 
as they are swept back and forth across the photoreceptors by the 
natural flutter of the eye. These considerations, in connection
with the presence of on- and off-receptors, have suggested to
many investigators the importance of various spatial derivatives of
the stimulus intensity in determining contour acuity. Chief consideration
has been accorded to the stimulus gradient, or first
derivative. However, E. Ludvigh (1953), studying the perception
of one-dimensional light intensity patterns which produce Mach
bands, emphasized the seemingly greater role of various higher
spatial derivatives of the stimulus intensity in the perception of
sharp contours in these patterns. Ludvigh's work differed from that
of other investigators in that he systematically varied individual
higher derivatives of the actual retinal light distribution.

*This research was supported by the United States Air Force through
the Air Force Office of Scientific Research of the Air Research and Development
Command under Contract No. AF 18 (600)-1454. Reproduction
in whole or in part is permitted for any purpose of the United States
Government.

This paper will present evidence supporting the hypothesis that 
Ludvigh’s results are most clearly seen as a manifestation of the 
presence of inhibitory cross-connections which cause a neural re- 
sponse at any point of the retina to be inhibited by the total neural 
response within a neighborhood of the point. This is the interpre- 
tation of the Mach bands given by G. A. Fry (1948) and suggested 
by K. Koffka and M. R. Harrower (1931). 

Ludvigh's observations are illustrated in Figure 1. His stimulus
patterns, plotted as stimulus energy ($E$) against position ($x$) on the
retina, consisted of sigmoid curves having various known amounts
of maximum curvature, as measured by $E''$ and $E^{(iv)}$, the second
and fourth derivatives of $E$ with respect to distance $x$. The perception
of contours was in general relatively independent of the
values of $E$ and its first two derivatives at the loci of the perceived
boundaries. A pair of edges (such as $C_1$ and $C_2$ or $D_1$ and $D_2$) appeared
to be symmetrically disposed about each maximum or minimum
of $E''$, and the separation between the two edges of each such
pair decreased when $E^{(iv)}$ was experimentally increased. Roughly
speaking, a narrow edge-doublet formed about a sharp bend and a
wide one about a region of more widely distributed curvature.

[Figure 1: upper panel, physical stimulus intensity; lower panel, subjective brightness (qualitative).]
FIGURE 1. Typical perception of one of Ludvigh's one-dimensional
patterns of illumination intensity. In the upper curve the ordinates represent
retinal illumination intensity and the abscissas represent linear
distance along the pattern projected on the retina. The lower curve is a
similar plot of the subjective brightness. If bend A is sharper, then $C_1$
and $C_2$ are closer together. If B is sharper, $D_1$ and $D_2$ are closer. If A
and B are extremely sharp, $D_2$ and $C_1$ may fuse.

[Figure 2: panels include the averaging diameter $a$ and $E - \bar E$ after threshold.]
FIGURE 2. Response of the contour mechanism. Ordinates represent
the various intensities indicated; abscissas represent distance along the
pattern.

In order to account for some of these results, a familiar type of
retinal neural interaction will be considered and a special case
examined. The mechanism responsible for the perception of contours
in Ludvigh's experiments will be supposed to be composed of
two kinds of retinal element: first, a set of elements which measure
the photoreceptor excitation $E$; and second, a set of more diffusely
connected overlapping elements which measure $\bar E$, the average
of $E$ over a region of diameter $a$. The two units at any particular
point of the retina interact, perhaps at the ganglion cell, and
the second inhibits the first, so that the excitation contributed by
the two units is something like $E - \bar E$. It is conceivable that the
two units might involve the midget bipolars and the more diffusely
connected bipolars, respectively, but it is also conceivable that
the postulated inhibition might occur further along in the visual
pathway. Other possible forms of inhibition have been considered
by Fry (loc. cit.).

We know the response of such a mechanism to the stimuli represented
by Ludvigh's sigmoid intensity versus retinal distance
curves. At bend A in Figure 2 the intensity $E$ is greater than its
average in the neighborhood, and the "contour-response" $E - \bar E$
has a positive hump in this region. At bend B the response is
represented by a negative depression. At all other points $E = \bar E$,
and the contour-response is zero. If a threshold exists, the response
will be discontinuous at each side of a sufficiently pronounced
maximum or minimum of $E - \bar E$, and edges will be perceived
at these points of discontinuity.
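The behavior just described can be reproduced with a very simple stand-in pattern. The sketch below uses a dark plateau, a linear rise, and a bright plateau in place of Ludvigh's sigmoid curves (an assumption made only to keep the arithmetic transparent), forms $E - \bar E$ with a uniform average over a width $a$, and reports where the magnitude of the response crosses a threshold. The pattern, the numerical values, and the function names are hypothetical.

```python
def ramp(x):
    """Assumed stand-in pattern: dark plateau, linear rise, bright plateau."""
    return min(max(x, 0.0), 1.0)


def box_average(E, x, a, m=200):
    """Midpoint-rule mean of E over the interval of width a centred on x."""
    h = a / m
    return sum(E(x - 0.5 * a + (i + 0.5) * h) for i in range(m)) / m


def contour_response(E, x, a):
    """E minus its local average, the postulated contour signal."""
    return E(x) - box_average(E, x, a)


def perceived_edges(E, a, threshold, xs):
    """Positions at which |E - E_bar| crosses the threshold on the grid xs."""
    edges = []
    prev = abs(contour_response(E, xs[0], a)) > threshold
    for x0, x1 in zip(xs, xs[1:]):
        cur = abs(contour_response(E, x1, a)) > threshold
        if cur != prev:
            edges.append(0.5 * (x0 + x1))
        prev = cur
    return edges


if __name__ == "__main__":
    a = 0.2
    grid = [i * 0.001 - 0.3 for i in range(1601)]
    print("response at upper bend:", contour_response(ramp, 1.0, a))
    print("response at lower bend:", contour_response(ramp, 0.0, a))
    print("edges:", perceived_edges(ramp, a, 0.01, grid))
```

For this pattern the response is a positive hump at the upper bend and a negative one at the lower bend, each giving a pair of threshold crossings placed symmetrically about the bend, and it vanishes wherever the averaging interval lies wholly on a straight segment.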

One feature of such a mechanism is that the contour-response, 
added to a relatively imprecise determination of the intensity E at 
a few points, would provide a fairly good specification of the gen- 
eral nature of the intensity pattern without the necessity of trans- 
mitting very much ‘‘redundant information’’ through the optic nerve. 

Since, in the one-dimensional case, $\bar E(x)$ is an average of the
function $E$ over an interval of width $a$, we have

$$E(x) - \bar E(x) = E(x) - \frac{1}{a}\int_{-a/2}^{a/2} E(x + \xi)\,d\xi,$$

and expansion of $E(x + \xi)$ in a Taylor series gives

$$E - \bar E = -\frac{a^2}{24}\,E'' - \frac{a^4}{1920}\,E^{(iv)} + \text{higher order terms}, \qquad (1)$$

where $a$, the diameter of the averaging interval, must be small if
the approximation is to be useful. These derivatives are just the
ones singled out by Ludvigh as being most important in the perception
of contours in his experiments.
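The coefficients in the expansion (1) can be confirmed numerically on any smooth pattern whose derivatives are known. The sketch below takes $E(x) = \sin x$, chosen purely for convenience, evaluates $E - \bar E$ by quadrature, and compares it with the two-term series; the test function, the averaging width, and the function names are assumptions of the example.

```python
import math


def box_average(E, x, a, m=2000):
    """Midpoint-rule mean of E over the interval of width a centred on x."""
    h = a / m
    return sum(E(x - 0.5 * a + (i + 0.5) * h) for i in range(m)) / m


def series_response(d2, d4, a):
    """Two-term approximation (1): -(a^2/24) E'' - (a^4/1920) E^(iv)."""
    return -(a ** 2 / 24.0) * d2 - (a ** 4 / 1920.0) * d4


def compare(x=1.0, a=0.5):
    """Quadrature value of E - E_bar versus the series, for E = sin."""
    exact = math.sin(x) - box_average(math.sin, x, a)
    # For E = sin x:  E'' = -sin x  and  E^(iv) = sin x.
    approx = series_response(-math.sin(x), math.sin(x), a)
    return exact, approx


if __name__ == "__main__":
    exact, approx = compare()
    print("quadrature:", exact, "series:", approx, "difference:", exact - approx)
```

The residual difference is of the order of the first neglected term, which for a uniform average is proportional to $a^6$ times the sixth derivative; this is one way of seeing why the series is useful only when $a$ is small compared with the scale of the pattern.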

Attempts were made, with indifferent success, to apply this ex- 
tension (1) of Ludvigh’s approach to Ludvigh’s published data in 
order to predict the location of the edges. In addition, modifica- 
tions of (1) which contained odd derivatives, or which corresponded 
to plausible non-uniform averaging weight-functions, were tested 
numerically or by geometrical construction. The general conclusion
was reached that the essential factor involved in edge perception
was E - Ē and that the particular derivatives considered
by Ludvigh derived their significance only insofar as they might
serve to determine E - Ē with accuracy. The frequent failure to
predict observed results reflected the inadequacy of an approximation
based upon (1). Predictions based upon the second derivative
alone, as by R. W. Burnham and J. E. Jackson (1955), give only a
very general notion of the location of the edges of the Mach bands.



Rough graphical estimation of E - Ē corresponding to Ludvigh’s
curves showed that in most cases the perceived edges were located
in positions consistent with an averaging distance of 0.08 mm, which
corresponds roughly to 16′ or 64 cone widths, if the response threshold
of the contour mechanism was taken as 4% relative intensity.
For this crude approximation the response characteristic was con- 
sidered to be linear over the stimulus intensity range employed. 

In the case of one of Ludvigh’s intensity curves with very gradual
curvature, the edges lie further from the point of maximum E″ than
would be predicted on the basis of the above approximation. This
discrepancy would be expected in case the averaging weight-function
assigns greater weight to points near the center of the
averaging interval than to points near the edges of the interval. It
will be noted that, when one end of such an averaging interval is
near a sharp bend in the intensity curve, most of the averaging
interval is far away from any curvature, and thus Ē for this interval
differs only slightly from E. On the contrary, in the case of gradual
curvature, if one end of the averaging interval is at a point of moderate
curvature the middle of the averaging interval is likely to be
at a region of slight curvature, and this latter curvature may contribute
strongly to E - Ē because of the higher weight given to the
center of the averaging interval.

The order of magnitude of the averaging distance is all that can 
reasonably be inferred from the above crude considerations, as 
nonlinearity of response was neglected and no notice was taken of 
the possibility that unequal weights might be given to E and Ē in
the process of inhibition which is considered to produce the contour-response.
The effect of unequal weights could be distinguished
from a change in the width of the averaging interval only by further
experiments.

Let us consider some direct consequences of the hypothesis 
that the contour-response depends on E - Ē. For purposes of discussion,
we shall first consider idealized rectilinear patterns with
very sharp bends. The purpose of using these is to dissociate the 
amount of curvature and the spatial extent of the curved regions, 
which, unfortunately for these considerations, were inversely re- 
lated to each other in Ludvigh’s experiments. 

FIGURE 3. Contour response for a rectilinear pattern. (a) represents
stimulus; (b) represents contour-response.

In particular, we shall see that the contour-response to a pattern
consisting of two intersecting line segments of slopes m and n
(Figure 3) increases (or decreases) from zero at the ends of the
averaging interval centered about the intersection of the line segments
to ⅛a(m - n) at the center of the interval. Figure 3 shows
the case where m > n; if m < n, the response will be negative. As
before, Ē(x) is found by averaging E over the interval (x - a/2, x + a/2).
We find in the present case 


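$$\bar{E}(x) = E(x) \quad \text{when } |x| > \frac{a}{2} \tag{2}$$

and

$$\bar{E}(x) = \frac{1}{2a}\left[n\left(x+\frac{a}{2}\right)^{2} - m\left(x-\frac{a}{2}\right)^{2}\right] \quad \text{when } |x| \le \frac{a}{2}, \tag{3}$$

where the intersection is taken as the origin, so that E(x) = mx for
x ≤ 0 and E(x) = nx for x ≥ 0. In particular,

$$\bar{E}(0) = -(m-n)\,\frac{a}{8}, \qquad E(0) = 0, \qquad E(0) - \bar{E}(0) = \frac{a}{8}\,(m-n). \tag{4}$$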
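The contour-response to such a two-segment pattern can be evaluated directly by numerical averaging. The sketch below is illustrative only and assumes the bend lies at the origin with slope m on the left and slope n on the right; the closed form used for comparison follows from carrying out the uniform average analytically. The function names and the sample values of m, n and a are hypothetical.

```python
def two_segment(m, n):
    """E(x) = m*x for x <= 0 and n*x for x > 0 (bend at the origin)."""
    return lambda x: m * x if x <= 0 else n * x

def window_average(E, x, a, steps=4000):
    """Uniform average of E over (x - a/2, x + a/2), by the midpoint rule."""
    step = a / steps
    return sum(E(x - a / 2 + (k + 0.5) * step) for k in range(steps)) / steps

def contour_response(E, x, a):
    """Contour-response E(x) - Ebar(x)."""
    return E(x) - window_average(E, x, a)

def closed_form(x, m, n, a):
    """(m - n)(a/2 - |x|)^2 / (2a) within a/2 of the bend, zero elsewhere."""
    if abs(x) >= a / 2:
        return 0.0
    return (m - n) * (a / 2 - abs(x)) ** 2 / (2 * a)

m, n, a = 2.0, -1.0, 0.4          # hypothetical slopes and averaging width
E = two_segment(m, n)
for x in (-0.3, -0.1, 0.0, 0.1, 0.3):
    print(x, contour_response(E, x, a), closed_form(x, m, n, a))
```

At the bend the response equals a(m - n)/8, it falls to zero at a distance a/2 on either side, and it changes sign when m < n.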


The following discussion gives a simple way of visualizing certain 
features of the contour-response to patterns like that of Figure 3. 

First, let us consider the special case in which the two line 
segments constituting the intensity pattern are symmetrically dis- 
posed about a vertical line through their intersection. They thus 
form the apex of an isosceles triangle whose base would be any
line of constant E. Let the slopes of the lines be ±m (Figure 4).
Placing n = -m in (3) we find that


$$\bar{E}(x) = -\left(\frac{m x^{2}}{a} + \frac{m a}{4}\right), \qquad |x| \le \frac{a}{2}. \tag{5}$$


FIGURE 5. Decomposition of E(x) into an even part and a non-effective
odd part. Contour-responses to A and B will thus be equivalent.


Now let us return to the more general case of the pattern il- 
lustrated in Figure 3. First we note that adding any linear function 
px + q to E(x) will not change E - Ē. Let us then decompose E(x)
into the sum of an even function and an odd function as shown in
Figure 5. Only the even part will contribute to E - Ē.
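Both statements are easy to confirm numerically. The sketch below is an illustration under assumed values (slopes, averaging width and the added linear function are hypothetical): it compares the contour-response of a two-segment pattern with that of the same pattern plus a linear function, and with that of its even part alone.

```python
def two_segment(m, n):
    """E(x) = m*x for x <= 0 and n*x for x > 0 (bend at the origin)."""
    return lambda x: m * x if x <= 0 else n * x

def window_average(E, x, a, steps=4000):
    """Uniform average of E over (x - a/2, x + a/2), by the midpoint rule."""
    step = a / steps
    return sum(E(x - a / 2 + (k + 0.5) * step) for k in range(steps)) / steps

def contour_response(E, x, a):
    """Contour-response E(x) - Ebar(x)."""
    return E(x) - window_average(E, x, a)

def add_linear(E, p, q):
    """Pattern E(x) + p*x + q."""
    return lambda x: E(x) + p * x + q

def even_part(E):
    """Even part of E about the origin: (E(x) + E(-x)) / 2."""
    return lambda x: 0.5 * (E(x) + E(-x))

a = 0.4                                   # hypothetical averaging width
E = two_segment(2.0, -1.0)                # hypothetical slopes m, n
E_tilted = add_linear(E, 5.0, 3.0)        # hypothetical p, q
E_even = even_part(E)

for x in (-0.15, 0.0, 0.05):
    print(x, contour_response(E, x, a),
          contour_response(E_tilted, x, a),
          contour_response(E_even, x, a))
```

The three columns agree to rounding error, since a uniform average reproduces any linear function exactly and the odd part of this pattern is linear.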

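Replacing m in (2)-(5) by ½(m - n), and n by -½(m - n), we find
that, for the even part of E,

$$\bar{E}(x) = \begin{cases} -\left(\dfrac{x^{2}}{2a} + \dfrac{a}{8}\right)(m-n) & \text{if } |x| \le \dfrac{a}{2}, \\[1ex] E(x) & \text{if } |x| > \dfrac{a}{2}, \end{cases} \tag{6}$$

and thus, for the original pattern,

$$\bar{E}(x) = \begin{cases} \dfrac{1}{2}(m+n)\,x - \left(\dfrac{x^{2}}{2a} + \dfrac{a}{8}\right)(m-n) & \text{if } |x| \le \dfrac{a}{2}, \\[1ex] E(x) & \text{if } |x| > \dfrac{a}{2}. \end{cases} \tag{7}$$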




The equivalence of the patterns in curves 5A and 5B is shown more 
clearly in (7) than in the mathematically equivalent expression (3). 

Several points are apparent and should be noted. The contour- 
response depends only upon the difference between the slopes of 
the two lines, to which it is proportional, and it is always sym- 
metric with respect to the intersection of the two line segments 
composing the stimulus pattern. Because the shape of this recti- 
linear stimulus pattern does not change with varying viewing dis- 
tance, the brightness contrast between the perceived band and the 
adjacent portions of the pattern should be relatively independent 
of the viewing distance. Adding a linear gradient should not affect 
this model of the contour mechanism. However, “tilting” the pattern
so as to make it “steeper” would increase m - n and hence
enhance the contour-producing ability of the bend in the stimulus
curve.

The degree to which stimulus patterns can differ and yet produce 
the same contour-response is shown strikingly in Figure 6. 

These conclusions were tested in some preliminary experiments. 
The stimulus intensities E(x) were realized as the variation of
light-dark ratio along the length of a rapidly rotating black and 
white cylinder. 

When a stimulus pattern consisting of two intersecting line seg- 
ments (as in Figure 3) was viewed, optical errors of the eye 
smoothed the shadow projected in the retina, prior to any smoothing 
involved in the contour mechanism. In this approximate treatment 
the exact form of the smoothing weight function does not have to 
be considered. Let us consider, as did Ludvigh, averaging with a
flat weight function over an interval x ± ½d. The result of the initial
smoothing is given by Ē of (2) and (3), with d substituted for a.
Since the resulting curve is smooth, we can get a good idea of
E - Ē by substituting the smoothed E in (1). Since the smoothed
stimulus curve is quadratic, only the first term in (1) is non-zero,
and we see that

$$E(\text{smoothed}) - \bar{E}(\text{smoothed}) = \frac{a^{2}}{24}\,\frac{m-n}{d} \tag{8}$$

near the center of the interval (-½d, ½d), and that E - Ē = 0 outside
the intervals (-½d, ½d) and (-½a, ½a). The proportionality
of (8) to m - n indicates that the unavoidable smoothing produced
by optical errors will not essentially modify the above conclusions.

Results of the preliminary experiments appeared consistent with


all predictions derived above. Bands seemed clearest (in the particular
experiments performed) when viewed from about five feet.
At this distance two bands began to fuse when they were about
2 mm apart. From the shape of the contour-response pattern resulting
from a sharp bend in the stimulus pattern, we should expect the
distinctness of two bands to be obscured when the distance between
the two bends in the stimulus pattern is about ½a. This
would make a somewhat more than 9′. (Calculations from Ludvigh’s
data, it will be remembered, yielded an estimate of 16′.) Again,
the crudeness of the methods used allows this result to suggest
only the order of magnitude of the averaging distance.

FIGURE 6. Contour-equivalent stimulus patterns (ordinates = relative
illumination intensity, abscissas = distance on retina) are given
by the solid lines AOA′, BOB′, COC′. The difference between slopes,
m - n, is proportional to 2OP, because, e.g., slope BP = slope
PB′, and by construction, slope BO = slope BP + (OP/r) and
slope OB′ = slope PB′ - (OP/r).
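The arithmetic behind the estimate of the averaging distance from the fusion experiment can be retraced as follows. This is a sketch only; it assumes that "five feet" means 1524 mm and that the two bands fuse when the bends are ½a apart, as stated in the text. The function and variable names are hypothetical.

```python
import math

FEET_TO_MM = 304.8

def visual_angle_arcmin(size_mm, distance_mm):
    """Visual angle, in minutes of arc, subtended by size_mm at distance_mm."""
    return math.degrees(math.atan2(size_mm, distance_mm)) * 60.0

# Two bands 2 mm apart viewed from five feet.
separation = visual_angle_arcmin(2.0, 5 * FEET_TO_MM)

# Fusion is expected when the bends are about a/2 apart, so a is about
# twice the angular separation at which fusion begins.
a_estimate = 2.0 * separation
print(separation, a_estimate)
```

The separation comes out at about 4.5′, so that a is a little over 9′, in line with the figure quoted above.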

Because of the rectilinearity of the stimulus patterns employed 
in this investigation, the change of scale which results from a 
change in the viewing distance has little effect upon the appearance
of small portions of the patterns. Attempts are being made to
utilize the effects of changes in viewing distance upon the ap- 
pearance of more complex patterns for the study of the sizes and 
organization of the inhibitory regions. It is expected that further 
studies of this nature should make it possible to learn details of 
the neural mechanism discussed here from more precise experi- 
mental methods like those of Ludvigh by revealing the significant 
factors whose effects should be systematically investigated. 


The author wishes to express his gratitude to Dr. H. D. Landahl 
for reading and discussing this paper. 


LITERATURE 


Burnham, R. W. and J. E. Jackson. 1955. “Mach Rings Verified by Numerical
Differentiation.” Science, 122, 951-3.

Fry, G. A. 1948. “Mechanisms Subserving Simultaneous Brightness Contrast.”
Am. J. of Optom. and Arch. Amer. Acad. Optom., 25, 162-78.

Koffka, K. and M. R. Harrower. 1931. “Colour and Organization.”
Psychologische Forschung, 15, 145-275.

Ludvigh, E. 1953a. “Perception of Contour. I. Introduction.” U.S.
Naval School of Aviation Medicine, Naval Air Station, Pensacola, Fla.
Project No. NM 001 0775.01.04 (Joint Report No. 4).

Ludvigh, E. 1953b. “Perception of Contour. II. Effect of Change of Retinal
Intensity Gradient.” U.S. Naval School of Aviation Medicine, Naval
Air Station, Pensacola, Fla. Project No. NM 001075.01.05 (Joint Report
No. 5).

Marshall, W. H. and S. A. Talbot. 1942. “Recent Evidence for Neural
Mechanisms in Vision Leading to a General Theory of Sensory
Acuity.” Pp. 117-64 in Visual Mechanisms. (H. Klüver, Ed.) Lancaster,
Pennsylvania: Jaques Cattell Press.


RECEIVED 11-16-56 


BULLETIN OF 
MATHEMATICAL BIOPHYSICS 
VOLUME 19, 1957 


ON THE INTERPRETATION OF THE EFFECT OF AREA
ON THE CRITICAL FLICKER FREQUENCY 


H. D. LANDAHL 
COMMITTEE ON MATHEMATICAL BIOLOGY 
THE UNIVERSITY OF CHICAGO* 


The effects of area and intensity on the critical flicker frequency, 
threshold, and reaction time are considered in terms of neural net theory. 
An attempt is made to develop a mechanism which can account for the 
phenomena associated with the empirically observed laws of Ricco, 
Granit, Talbot, and Ferry-Porter as well as observations on reaction time 
and threshold. A simple model gives results which are substantially in 
agreement with observation except for a few apparent discrepancies. 
Experimental procedures are suggested which can determine whether these
are apparent or real.


A simple interpretation of the critical flicker frequency has been
given (Householder and Landahl, 1945, Chap. VI) in terms of the 
neural net theory (Rashevsky, 1948). It is the purpose of the present 
note to consider the effect of area in terms of this model. Perhaps 
the simplest model which introduces the area is that of a net in 
which some center receives collaterals from various parts of the 
retina but not necessarily the same numbers from every region. For 
simplicity we shall consider here only the case of central symmetry. 
If p is the radial distance in the retina from the central fovea, then
2πpF(p)dp is the number of elements which synapse upon the neural
center being considered and which arise from the circular shell
between p and p + dp. If S(x,y) is the stimulus intensity in energy
per unit area and time at a point x, y, then the excitation φ[S(x,y)]


*This research was supported by the United States Air Force through
the Air Force Office of Scientific Research of the Air Research and Development
Command under contract No. AF 18(600)-1454. Reproduction
in whole or in part is permitted for any purpose of the United States
Government.


(Householder and Landahl, hereinafter referred to as loc. cit., Chap.
I) becomes

$$2\pi \int_{0}^{P} \phi[S(x,y)]\; p\,F(p)\; dp, \tag{1}$$

P being the distance from the center of the retina to the periphery.
Using the expression previously obtained (loc. cit., p. 44) it is 
possible to obtain an equation for the critical flicker frequency as 
a function of size and intensity of a given stimulus pattern. It is 
found that on the basis of simple forms of the distribution function 
F, the frequency at fusion is proportional to the area for small 
stimuli and to the logarithm of the intensity. Actually it is found 
that over a range of areas and intensities the frequency is approxi- 
mately proportional to the logarithm of the area times intensity. 

Before considering this problem further we shall introduce a 
necessary modification which should have been introduced before. 
In the previous work (loc. cit.) it was assumed that the function
φ(S) is a logarithmic function of the stimulus intensity S. For
a rapidly flickering stimulus the mean value of the net excitation,
σ = ε - j, ε and j being the excitatory and inhibitory factors, is
proportional to the mean value of φ. Empirically it is found that
the response is determined by the mean value of S. This difficulty
can be eliminated without upsetting the explanation of other phenomena
if φ is defined in the following manner. Let S̄ be the average
value of S over a period of time considerably larger than the
time 1/b, b being the rate constant of the inhibitory process. That
is, S̄ may be calculated from the expression

$$\bar{S} = c\,e^{-ct}\int^{t} S\,e^{ct}\,dt, \tag{2}$$

where S is a function of time and c a constant the reciprocal of
which is several hundred seconds. We shall then assume that φ(S)
is given by


$$\phi(S) = \frac{\beta\,(S-h)}{\bar{S}-h}\,\log\frac{\bar{S}}{h}, \tag{3}$$

where β is a constant and h a threshold value. Expression (3) may
be considered to be an approximation to a more complex relation
resulting from the photochemical processes which occur in the
retina. If S < h, φ is defined as being equal to zero, so that if S is
less than h we may set S = h. Similarly if S̄ < h, we set S̄ = h. It
can be seen that if S̄ is considerably larger than h, the mean value
of φ(S) over a short time will be proportional to the mean value of
S, while for a constant stimulus which is maintained for a long time
φ is again proportional to the logarithm of S.
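The two limiting behaviors of expression (3) can be seen in a short numerical sketch. The code below is illustrative only: the values of β and h are arbitrary, the function name is hypothetical, and the treatment of the case S̄ = h (taking the limit of log(S̄/h)/(S̄ - h), which is 1/h) is an assumption of the sketch rather than something stated in the text.

```python
import math

def phi(S, S_bar, beta=1.0, h=1.0):
    """Excitation per expression (3); S and S_bar below h are raised to h."""
    S = max(S, h)
    S_bar = max(S_bar, h)
    if S_bar == h:
        # Assumed convention: use the limit of log(S_bar/h)/(S_bar - h)
        # as S_bar -> h, which equals 1/h.
        return beta * (S - h) / h
    return beta * (S - h) / (S_bar - h) * math.log(S_bar / h)

# Constant stimulus maintained for a long time: S_bar = S, logarithmic response.
print(phi(100.0, 100.0), math.log(100.0))

# Brief changes of S at fixed S_bar: response proportional to S - h.
print(phi(11.0, 50.0), phi(21.0, 50.0))

# Sub-threshold stimulus: zero excitation.
print(phi(0.5, 10.0))
```

With S̄ held fixed, doubling S - h doubles φ, whereas a steady stimulus with S̄ = S gives β log(S/h).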

If expression (3) is used instead of the simple logarithmic rela- 
tion then the expression for the reaction time (loc. cit., Chap. VI, 
eq. 1) should involve S instead of the logarithm of S if S̄ is less
than the threshold h at the time that the stimulus S is applied.
However, S̄ will be less than h only if the time which elapses between
trials is sufficient for adaptation to be quite complete. If
not, then even though the residue is only a few times the threshold, 
being on the average approximately a constant fraction, the relation 
approaches more nearly that involving the logarithm of the stimulus. 

We consider next the situation in which a circular stimulus of
radius R is presented for a relatively short time to test the threshold
to response. We suppose that just prior to the test S has been
zero for a time t ≥ 3/b so that, since a > b, we may ignore the
inhibitory effect due to j. However, we shall permit S̄ to take on
any value, its value being estimated at the time t = 0 when the test
stimulus S is applied. For each neural element the excitatory factor
ε will be given from dε/dt = aφ - aε, so that from (3) we find

$$\epsilon = \frac{\beta\,(S-h)}{\bar{S}-h}\,\log\frac{\bar{S}}{h}\,\bigl(1 - e^{-at}\bigr). \tag{4}$$


The contribution to the total excitation ε_T from all the excited
elements will then be given, as in (1), by

$$\epsilon_T = 2\pi\,\frac{\beta\,(S-h)}{\bar{S}-h}\,\log\frac{\bar{S}}{h}\,\bigl(1 - e^{-at}\bigr)\int_{0}^{R} p\,F(p)\,dp. \tag{5}$$


If h′ is the threshold of the neural center towards which the
neural elements converge, we can find a relation among the variables
S, t, and R for a threshold response by setting ε_T = h′ in (5).
If S ≫ h, t ≪ 1/a and R is small enough so that F(R) = F(0), then
it can be seen that (5) leads to the relation SAt = constant, where
A is the area of the stimulus. Hence Ricco’s law and the Bunsen-Roscoe
law are satisfied. Note that the constant depends on S̄.

The time t_r for the excitation ε_T at the neural center to reach h′
is given by (5) with t = t_r, ε_T = h′. This expression, solved for t_r,
gives the reaction time t_r as a function of stimulus intensity and
area.
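Solving (5) for the time to threshold is straightforward when F is taken as constant over the stimulus. The sketch below makes that assumption (F(p) = F0, so that 2π∫pF dp = F0·A with A the stimulus area), takes the dark-adapted case S̄ = h, and uses arbitrary parameter values; all names and numbers are hypothetical.

```python
import math

def phi(S, S_bar, beta, h):
    """Expression (3), with the limit 1/h used when S_bar = h (assumed)."""
    S = max(S, h)
    S_bar = max(S_bar, h)
    if S_bar == h:
        return beta * (S - h) / h
    return beta * (S - h) / (S_bar - h) * math.log(S_bar / h)

def reaction_time(S, area, S_bar, a, beta, h, h_prime, F0=1.0):
    """Time at which eps_T of (5) reaches h_prime, taking F(p) = F0."""
    drive = phi(S, S_bar, beta, h) * F0 * area   # value eps_T tends to
    if drive <= h_prime:
        return math.inf                          # threshold never reached
    return -math.log(1.0 - h_prime / drive) / a

params = dict(S_bar=1.0, a=100.0, beta=1.0, h=1.0, h_prime=1.0)
t1 = reaction_time(1000.0, 0.1, **params)
t2 = reaction_time(2000.0, 0.1, **params)
print(1000.0 * 0.1 * t1, 2000.0 * 0.1 * t2)   # S*A*t is nearly constant
```

For intense, brief, small stimuli the product SAt stays nearly constant, while a stimulus too weak to drive ε_T up to h′ yields no response at any duration.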

To illustrate the results from (5) we shall consider three simple 
distributions: 


Fi(p)=(1+ 91 ays (6) 
F,(p) = (1+ gf p*)*, (7) 
Fido iste hee (8) 


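If we define A as an area relative to the area of a circle whose
radius is the space constant of the distribution, in other words

$$A = \frac{\pi R^{2}}{\pi / g_i^{2}} = g_i^{2} R^{2} \qquad (i = 1, 2, 3), \tag{9}$$

we may write (5), for any of the distributions, in the form

$$(S-h)\,t\,A\,G(A)\,J(at) = \frac{h'\,g_i^{2}}{\pi a \beta}\;\frac{\bar{S}-h}{\log(\bar{S}/h)}, \tag{10}$$

where

$$G(A) = \frac{2}{R^{2}}\int_{0}^{R} p\,F(p)\,dp, \qquad J(at) = \frac{1 - e^{-at}}{at}, \tag{11}$$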


so that G(0) = J(0) = 1. The functions J(at) and G(A) are shown
graphically in Figure 1. The latter function is shown for each of
the distributions (6)-(8). It can be seen that AG(A) is proportional
to the area A if the illuminated area on the retina is small.
But AG(A) is more nearly like a fractional power of the area A for
larger values of the latter.

FIGURE 1. Abscissas: A (solid curves) and at (broken curve).
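The behavior of G and J can be illustrated numerically. In the sketch below the distribution F(p) = exp(-gp) is a hypothetical choice made only for illustration and is not claimed to be one of the distributions (6)-(8); G is taken as (2/R²)∫pF(p)dp over 0 to R and J(at) as (1 - e^(-at))/(at), so that both equal one at zero argument.

```python
import math

def G(F, R, n=2000):
    """G = (2/R^2) * integral from 0 to R of p F(p) dp, by the midpoint rule."""
    step = R / n
    total = sum((k + 0.5) * step * F((k + 0.5) * step) for k in range(n))
    return 2.0 * total * step / R ** 2

def J(u):
    """J(at) = (1 - exp(-at)) / (at), with J(0) = 1."""
    return 1.0 if u == 0 else -math.expm1(-u) / u

g = 1.0                                  # hypothetical reciprocal space constant
F = lambda p: math.exp(-g * p)           # hypothetical distribution, F(0) = 1

for R in (0.01, 0.5, 2.0, 5.0):
    A = (g * R) ** 2                     # relative area
    print(A, G(F, R), A * G(F, R))
```

For small A the product AG(A) is nearly equal to A, and it grows more slowly than A as the stimulus extends beyond the space constant of the distribution.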

We have assumed that F(p) is essentially a continuous distribu- 
tion. Actually it would perhaps be more likely from anatomical 
considerations that the distribution is made up of a direct, local- 
ized component together with a relatively smooth distribution. In 
this event the following procedure may be useful in estimating the 
actual distribution. If we use short times only so that J is approximately
one and if we use large enough values of S so that
S - h can be replaced by S, then putting S̄ = h for simplicity, (5)
may be written

$$(St)^{-1} = \frac{2\pi\beta a}{h\,h'}\int_{0}^{R} p\,F(p)\,dp. \tag{12}$$

But (St)^{-1} is the sensitivity to the energy per unit area for an illuminated
circular area with radius R. Plotting this quantity as
found experimentally and differentiating graphically, the function
F(R) can be found from (12) to be given by

$$F(R) = \frac{h\,h'}{2\pi\beta a R}\;\frac{d\,(St)^{-1}}{dR}. \tag{13}$$
If we consider a narrow bar of width w and length 2L which is 
fixated at the center, then the function F, in this case F(L), can 
be obtained from the expression 


$$F(L) = \frac{h\,h'}{2\beta a w}\;\frac{d\,(St)^{-1}}{dL}. \tag{14}$$
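The graphical differentiation described for circular stimuli can be imitated numerically. The sketch below uses synthetic sensitivity values generated from an assumed distribution F(p) = exp(-p), with a single constant K standing in for 2πβa/(hh′); it then recovers F(R) as the slope of the sensitivity curve divided by KR. The data, the constant and the function names are all hypothetical.

```python
import math

K = 2.0   # stands in for 2*pi*beta*a/(h*h'); the value is arbitrary

def sensitivity(R):
    """Synthetic (St)^-1 = K * integral_0^R p exp(-p) dp (assumed F)."""
    return K * (1.0 - math.exp(-R) * (1.0 + R))

def estimate_F(sens, R, dR=1e-4):
    """F(R) = (1 / (K R)) d(St)^-1/dR, slope taken by central difference."""
    slope = (sens(R + dR) - sens(R - dR)) / (2.0 * dR)
    return slope / (K * R)

for R in (0.5, 1.0, 2.0):
    print(R, estimate_F(sensitivity, R), math.exp(-R))
```

With measured sensitivities in place of the synthetic curve, the same slope estimate would give the distribution up to the constant factor K.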


We return now to the problem of flickering light. We suppose that 
the eye has been exposed to the flickering light for a long enough 
time so that the duration has no appreciable effect. From the expression
for the critical flicker frequency given previously (loc.
cit., p. 44, eq. 14) we have, using (3) and integrating over p,


(a-d)Br(l-r)S-h S Leite oe ees 
wee ee eG SAG LATS 5 (1 2r) AG(A) +... (15) 


The value of S̄ at any point which is stimulated intermittently will
be rS, r being the fraction of time that the light is on with intensity
S. If we stop here, the result will be essentially the same as if
φ were assumed to be directly proportional to the logarithm of S
except for the dependence of f on r. If there were a negligible effect
due to scattered light and also due to eye movements then expression
(15) should hold if the model is not modified. These effects,
however, are usually present under experimental conditions.
We will show next how the critical flicker frequency may depend 
upon area under the influence of such effects by a special case in 
which the eye movements are assumed to be such that the average 
intensity is spread out radially according to an exponential func- 
tion. Assuming also that the mean intensity S at the center of the 
circular area can be used to represent the average value over the 
area, we can set S equal to rSA2n/g, as a first approximation. 
The quantity g4 is the coefficient of p in the exponential function. 
Then we find the critical flicker frequency from (15), ignoring the 
second-order term, to be given by 

(a-b) B(1—-r)ga 2nrAS 


ldg:—- = tay (16) 


ao Qmh? gah 


For this special case we see that for small areas f is propor- 
tional to the logarithm of the product of the area and intensity 
while for larger areas the dependence on area is somewhat less than 
that on intensity. Thus the result is in general in agreement with 
the Ferry-Porter law and the findings of Granit and Harper (1930). 
It should hardly be necessary to point out, however, that in the 
case of flicker phenomena the assumptions made to obtain (16) are 
substantially weaker. On the other hand, if it could be shown that 
the empirical relation approaches that of (15) when the effects of 
eye movements are eliminated, this would strengthen the model. 


The author wishes to express his appreciation to Dr. S. Kamiya 
for reading and discussing this paper. 


LITERATURE 


Granit, R., and P. Harper. 1930. “Comparative Studies on the Peripheral
and Central Retina. II.” Am. J. Physiol., 95, 211-28.
Householder, A. S., and H. D. Landahl. 1945. Mathematical Biophysics
of the Central Nervous System. Bloomington, Indiana: The Principia
Press.
Rashevsky, N. 1948. Mathematical Biophysics. Rev. Ed. Chicago: The
University of Chicago Press.


RECEIVED 3-23-57 


